GitHub repo for the Kaggle Quora question similarity problem
The data for this competition is too large to host on GitHub directly. All notebooks in this repo expect the following files to be available on your machine, in the following directories:
data/
- sample_submission.csv - from the competition website
- test.csv - from the competition website
- train.csv - from the competition website
- wikitext-103/ - from the wikitext dataset
  - wiki.test.tokens
  - wiki.valid.tokens
  - wiki.train.tokens
- wikitext-2/ - from the wikitext dataset
  - wiki.test.tokens
  - wiki.train.tokens
  - wiki.valid.tokens
- wikitext-103-raw/ - from the wikitext dataset
  - wiki.test.raw
  - wiki.valid.raw
  - wiki.train.raw
- wikitext-2-raw/ - from the wikitext dataset
  - wiki.test.raw
  - wiki.train.raw
  - wiki.valid.raw
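
Before opening a notebook, it may help to confirm that the layout listed above is in place. The following is a minimal sketch, not part of the repo: the names `EXPECTED_FILES` and `missing_data_files` are hypothetical, and it assumes `data/` is resolved relative to the directory you run it from.

```python
from pathlib import Path

# Hypothetical helper, not part of this repo. The file list mirrors the
# layout described in this README.
SPLITS = ("train", "valid", "test")

EXPECTED_FILES = ["sample_submission.csv", "test.csv", "train.csv"]
# Tokenized WikiText variants use the .tokens extension.
EXPECTED_FILES += [
    f"{name}/wiki.{split}.tokens"
    for name in ("wikitext-103", "wikitext-2")
    for split in SPLITS
]
# Raw WikiText variants use the .raw extension.
EXPECTED_FILES += [
    f"{name}/wiki.{split}.raw"
    for name in ("wikitext-103-raw", "wikitext-2-raw")
    for split in SPLITS
]


def missing_data_files(data_dir="data"):
    """Return the expected files (relative paths) not found under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing data files:")
        for rel in missing:
            print(f"  data/{rel}")
    else:
        print("All expected data files are present.")
```

Run it from the repo root; any paths it prints still need to be downloaded from the competition website or the wikitext dataset.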