Context Vectors (CoVe)

This repo provides the best MT-LSTM from the paper Learned in Translation: Contextualized Word Vectors (McCann et al., 2017). For a high-level overview of why CoVe are great, check out the post.

example.py uses torchtext to load the Stanford Natural Language Inference Corpus and GloVe.

It uses a PyTorch implementation of the MTLSTM class in mtlstm.py to load a pretrained encoder, which takes in sequences of vectors pretrained with GloVe and outputs CoVe.

A Keras/TensorFlow implementation of the MT-LSTM/CoVe can be found at https://github.com/rgsachin/CoVe.

Unknown Words

Out-of-vocabulary words for CoVe are also out of vocabulary for GloVe, which should be rare for most use cases. During training, the CoVe encoder received a zero vector for any word that was not in GloVe, and zero vectors were also used for unknown words in our classification and question answering experiments, so that is the recommended approach.
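The zero-vector convention can be sketched as a lookup that falls back to zeros for any word missing from GloVe. This is a minimal illustration in plain Python, not code from this repo: the `glove` dictionary, its toy 4-dimensional vectors, and the `embed_tokens` helper are all hypothetical.

```python
from typing import Dict, List


def embed_tokens(tokens: List[str],
                 glove: Dict[str, List[float]],
                 dim: int) -> List[List[float]]:
    """Map tokens to GloVe vectors, using a zero vector for unknown words."""
    zero = [0.0] * dim
    # Copy each vector so callers cannot mutate the shared table or `zero`.
    return [list(glove.get(tok, zero)) for tok in tokens]


# Toy table standing in for real GloVe vectors (hypothetical values).
glove = {
    "the": [0.1, -0.2, 0.3, 0.0],
    "cat": [0.5, 0.4, -0.1, 0.2],
}

vectors = embed_tokens(["the", "cat", "zzyzx"], glove, dim=4)
print(vectors[2])  # the unknown word "zzyzx" gets the all-zero vector
```

In a real pipeline the same fallback would apply to whatever tensor type holds the embeddings; the sequence of vectors is then what the encoder receives as input.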

You could also try initializing unknown inputs to something close to GloVe vectors instead, but we have no experiments suggesting that this would work better than zero vectors. If you want to try this, GloVe vectors follow (very roughly) a Gaussian with mean 0 and standard deviation 0.4. You could initialize by drawing randomly from that distribution, but you would probably want to train those embeddings while keeping the CoVe encoder (MTLSTM) and GloVe fixed.
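Drawing such an initialization might look like the sketch below. It uses only Python's standard library. The `init_unknown_vector` helper, the seed, and the dimension of 300 are illustrative assumptions, and the mean and standard deviation are the rough figures quoted above. Making these vectors trainable while the GloVe table and the MTLSTM stay frozen is not shown here.

```python
import random
import statistics
from typing import List


def init_unknown_vector(dim: int, rng: random.Random,
                        mean: float = 0.0, std: float = 0.4) -> List[float]:
    """Draw one vector with i.i.d. Gaussian components (rough GloVe statistics)."""
    return [rng.gauss(mean, std) for _ in range(dim)]


rng = random.Random(0)  # fixed seed so the draw is reproducible
unk = init_unknown_vector(dim=300, rng=rng)  # dim=300 is an assumption

# Sanity check: the sample statistics should land near the requested ones.
print(round(statistics.mean(unk), 2), round(statistics.stdev(unk), 2))
```

Each unknown word would get its own draw, so distinct unknown words start from distinct vectors rather than sharing a single one.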

There is also a third option if you are operating in an entirely different context: retrain the bidirectional LSTM using trained embeddings. If you are mostly encoding a non-English language, that might be the best option. Check out the paper for details.

Running with Docker

Install Docker. Install nvidia-docker if you would like to use it with a GPU.

docker pull bmccann/cove   # pull the docker image
docker run -it bmccann/cove   # start a docker container from the pulled image
python /cove/test/example.py

Running without Docker

Install PyTorch.

git clone https://github.com/salesforce/cove.git # or use ssh: git@github.com:salesforce/cove.git
cd cove
pip install -r requirements.txt
python setup.py develop
python test/example.py

References

If using this code, please cite:

B. McCann, J. Bradbury, C. Xiong, R. Socher, Learned in Translation: Contextualized Word Vectors

@article{McCann2017LearnedIT,
  title={Learned in Translation: Contextualized Word Vectors},
  author={Bryan McCann and James Bradbury and Caiming Xiong and Richard Socher},
  journal={arXiv preprint arXiv:1708.00107},
  year={2017}
}

Contact: [email protected]
