
pytorch-nlp-tutorial-nyc2017's Introduction

pytorch-nlp-tutorial

Day 1

Day 1 Slides

Day 1 Data

1. Create the data folder: mkdir day_1/data
2. Copy the contents of this drive into day_1/data:

https://drive.google.com/open?id=0B1sSP-aCtfuHRFJWTkdUbjFUZDQ

3. Download GloVe and unpack its contents into day_1/data/glove:
http://nlp.stanford.edu/data/glove.6B.zip
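
The unpacking in step 3 can also be scripted. The following is a minimal sketch, assuming the zip file has already been downloaded; the helper name unpack_archive is hypothetical and not part of the tutorial code.

```python
import zipfile
from pathlib import Path


def unpack_archive(zip_path, dest_dir):
    """Extract a zip archive into dest_dir, creating the folder if needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Return the names now present in dest_dir so the caller can check the result.
    return sorted(p.name for p in dest.iterdir())


# Example call (assumes glove.6B.zip sits in the current folder):
# unpack_archive("glove.6B.zip", "day_1/data/glove")
```

Creating the folder with parents=True means the call works even if day_1/data does not exist yet.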

Day 2

Day 2 Slides

Day 2 Data

  1. Trump Tweets
  2. Not-pruned Names dataset
     a. Train
     b. Test
     c. Day One Version
  3. Stanford NLI dataset
  4. Amazon Reviews small train

# install anaconda (if needed)

conda create -n dl4nlp python=3.6
source activate dl4nlp
conda install ipython
conda install jupyter
python -m ipykernel install --user --name dl4nlp

# install pytorch
# visit pytorch.org

# assume we are inside a folder named dl4nlp
# note: if you download the zip and unzip it instead of cloning,
#   the folder will have a different name
git clone https://github.com/joosthub/pytorch-nlp-tutorial.git
cd pytorch-nlp-tutorial

pip install -r requirements.txt

# go back to the root folder
cd ..

# install torchtext
git clone https://github.com/pytorch/text.git
cd text
python setup.py install

pytorch-nlp-tutorial-nyc2017's People

Contributors

braingineer, delip

pytorch-nlp-tutorial-nyc2017's Issues

Add models for slow learning

Add pre-trained models so that students on machines without CUDA can load them instead of training their own.

escape % in string

In 4_Chinese_document_classification, I had to escape the '%' when defining the chars variable in "2. Build vocab" in order to use it when implementing a solution with the names data.

names_train.csv and names_test.csv entries

Some rows have a language but no name.

In names_train.csv: lines 983, 2148, 2991, 3248, 4494, 4904, 5972, 7260, 7955, 8122, 9779, 10408, 10696

In names_test.csv: lines 983, 2148
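
Rows like these can be located with a short check. This is a minimal sketch: the column name "name", the header row, and the sample data are assumptions made for illustration, since the issue does not show the real files' layout.

```python
import csv
import io


def rows_missing_name(csv_file, name_column="name"):
    """Return 1-based file line numbers of data rows whose name field is blank."""
    reader = csv.DictReader(csv_file)
    missing = []
    for row in reader:
        # Treat absent, empty, and whitespace-only values as missing.
        if not (row.get(name_column) or "").strip():
            missing.append(reader.line_num)
    return missing


# Made-up sample data; the real files would be opened with open(path, newline="").
sample = io.StringIO(
    "name,nationality\nSmith,English\n,Russian\nTanaka,Japanese\n ,Korean\n"
)
print(rows_missing_name(sample))  # [3, 5]
```

The reported numbers count the header as line 1, which may or may not match the numbering used in the issue.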

datautils import path (day 2)

Add a leading . to the import path, like

from .datautils.vocabulary import Vocabulary

to avoid a conflict with the one in the public repo.

Also, datautils.misc is missing in
day_2/02_Names_CharNN_Nationality_Classifier.ipynb

01_Trump_Tweet_LM mods

  • Needed to add "import json" so that save works in TweetLanguageModel and TrumpTweetVectorizer
  • Needed to call "model.cuda()" before training
  • " ".join(greedy_sample_from(model, vectorizer, temperature=0.9, n_length=30, use_cuda=False)[1:]) needs to be moved below the creation of the "model" object.
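
The first point is easy to reproduce: a save method that calls json.dump raises a NameError unless json is imported at module level. A minimal sketch follows, using a hypothetical TinyVectorizer class as a stand-in; it is not the notebook's TrumpTweetVectorizer.

```python
import json
import os
import tempfile


class TinyVectorizer:
    """Hypothetical stand-in for a vectorizer that maps tokens to integer ids."""

    def __init__(self, token_to_idx=None):
        self.token_to_idx = dict(token_to_idx or {})

    def save(self, path):
        # Without the module-level "import json", this line raises NameError.
        with open(path, "w", encoding="utf-8") as fp:
            json.dump({"token_to_idx": self.token_to_idx}, fp)

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as fp:
            return cls(json.load(fp)["token_to_idx"])


# Round trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "vectorizer.json")
TinyVectorizer({"<unk>": 0, "great": 1}).save(path)
restored = TinyVectorizer.load(path)
```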

02_Names_CharNN_Nationality_Classifier

  • In the NamesClassifier class: "last_item_indices += torch.arange(0, x_in.size(0)).long().cuda() * x_in.size(1)" needs a non-GPU version (the .cuda() call should be moved inside the "if use_gpu" check).
  • Need "import json" for save
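
One way to avoid the hard-coded .cuda() call is to create the offsets on whatever device the input already lives on. This is a minimal sketch, assuming PyTorch 0.4 or later (where torch.arange accepts a device argument); the names flat_last_indices, x_lengths, and y_out are hypothetical and not taken from the notebook.

```python
import torch


def flat_last_indices(x_lengths, batch_size, seq_len):
    """Index of each sequence's last real step in a flattened (batch * seq_len) view."""
    # Offsets are built on the same device as x_lengths, so the same code
    # runs on CPU and GPU without an explicit .cuda() call.
    offsets = torch.arange(batch_size, device=x_lengths.device, dtype=torch.long) * seq_len
    return (x_lengths.long() - 1) + offsets


# Toy batch: 3 sequences padded to length 4, hidden size 2.
y_out = torch.arange(3 * 4 * 2, dtype=torch.float32).view(3, 4, 2)
x_lengths = torch.tensor([4, 2, 3])
idx = flat_last_indices(x_lengths, batch_size=3, seq_len=4)
last_states = y_out.contiguous().view(-1, 2)[idx]
print(idx.tolist())  # [3, 5, 10]
```

On older PyTorch versions without the device argument, the equivalent fix is the one the issue suggests: call .cuda() on the offsets only inside the "if use_gpu" branch.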
