DataLoader for Seq2seq
Efficient data loader for text datasets, built on torch.utils.data.Dataset, a custom collate_fn, and torch.utils.data.DataLoader.
Prerequisites
Usage
1. Clone the repository
$ git clone https://github.com/yunjey/seq2seq-dataloader.git
$ cd seq2seq-dataloader
2. Install nltk and download the punkt tokenizer
$ pip install nltk
$ python
>>> import nltk
>>> nltk.download('punkt')
3. Build word2id dictionary
$ python build_vocab.py
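To make the idea of a word2id dictionary concrete, here is a minimal sketch of how such a mapping can be built. It is not the code in build_vocab.py: the function names `build_word2id` and `encode`, the `min_count` parameter, the special tokens (`<pad>`, `<start>`, `<end>`, `<unk>`) and their ids are all assumptions made for illustration, and plain whitespace splitting stands in for the nltk tokenizer.

```python
from collections import Counter

# Hypothetical special tokens and ids; the repository may use different ones.
SPECIAL_TOKENS = ["<pad>", "<start>", "<end>", "<unk>"]


def build_word2id(sentences, min_count=1):
    """Map every word seen at least `min_count` times to an integer id."""
    counter = Counter()
    for sentence in sentences:
        # Whitespace split is a stand-in for a real tokenizer such as nltk's.
        counter.update(sentence.lower().split())

    word2id = {token: idx for idx, token in enumerate(SPECIAL_TOKENS)}
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    for word, count in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            word2id[word] = len(word2id)
    return word2id


def encode(sentence, word2id):
    """Convert a sentence to ids, wrapped in <start>/<end>, with <unk> fallback."""
    ids = [word2id["<start>"]]
    ids += [word2id.get(tok, word2id["<unk>"]) for tok in sentence.lower().split()]
    ids.append(word2id["<end>"])
    return ids


corpus = ["the cat sat", "the dog sat down"]
word2id = build_word2id(corpus)
encoded = encode("the bird sat", word2id)  # "bird" is unseen, so it maps to <unk>
```

Reserving id 0 for the padding token is a common choice because it lets padded positions be filled with zeros later on.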
4. Check DataLoader
For usage, please see example.ipynb.
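For readers who want a feel for what a custom collate_fn does before opening the notebook, the following is a rough sketch of the padding step, written in pure Python so it runs without PyTorch. It is an illustration under stated assumptions, not the repository's implementation: the batch layout (a list of `(src_ids, trg_ids)` pairs), the `pad_id` parameter, and the sort by source length are all hypothetical choices.

```python
def collate_fn(batch, pad_id=0):
    """Pad variable-length (source, target) id sequences to a common length.

    `batch` is assumed to be a list of (src_ids, trg_ids) pairs, one per example.
    Returns padded sources, source lengths, padded targets, target lengths.
    """
    # Sort by source length, longest first (often convenient for packed RNN input).
    batch = sorted(batch, key=lambda pair: len(pair[0]), reverse=True)
    src_seqs, trg_seqs = zip(*batch)

    def pad(seqs):
        lengths = [len(seq) for seq in seqs]
        max_len = max(lengths)
        # Right-pad every sequence with pad_id up to the longest one in the batch.
        padded = [list(seq) + [pad_id] * (max_len - len(seq)) for seq in seqs]
        return padded, lengths

    src_padded, src_lengths = pad(src_seqs)
    trg_padded, trg_lengths = pad(trg_seqs)
    return src_padded, src_lengths, trg_padded, trg_lengths


batch = [
    ([1, 5, 2], [1, 7, 8, 9, 2]),
    ([1, 4, 5, 6, 2], [1, 3, 2]),
]
src, src_len, trg, trg_len = collate_fn(batch)
```

In a PyTorch setting, a function of this shape would typically convert the padded lists to tensors and be passed to torch.utils.data.DataLoader through its `collate_fn` argument, so that each batch is padded only to the length of its own longest sequence.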