We use the CoNLL-2000 training dataset and the wiki-news-300d-1M word embeddings. To download the dataset and embeddings, run download.sh
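
For orientation, here is a minimal sketch of how the two downloaded resources could be read. It assumes the standard CoNLL-2000 layout (one `word POS-tag chunk-tag` triple per line, blank line between sentences) and the fastText `.vec` text layout (a `count dim` header line, then one word and its vector per line). The helper names `read_conll` and `read_vec` are hypothetical and are not taken from the src folder. The file names produced by download.sh are not stated here, so the example reads small in-memory samples instead.

```python
from io import StringIO


def read_conll(handle):
    """Yield sentences as lists of (word, pos_tag, chunk_tag) tuples."""
    sentence = []
    for line in handle:
        line = line.strip()
        if not line:
            # A blank line marks the end of a sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        word, pos_tag, chunk_tag = line.split()
        sentence.append((word, pos_tag, chunk_tag))
    if sentence:
        yield sentence


def read_vec(handle, limit=None):
    """Read fastText .vec text: a 'count dim' header, then 'word v1 ... vd'."""
    _count, dim = map(int, handle.readline().split())
    vectors = {}
    for i, line in enumerate(handle):
        if limit is not None and i >= limit:
            break
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


# Tiny in-memory samples standing in for the downloaded files (assumption).
sample_conll = StringIO("He PRP B-NP\nreckons VBZ B-VP\n\nThe DT B-NP\ndeficit NN I-NP\n")
sample_vec = StringIO("2 3\nhe 0.1 0.2 0.3\nthe 0.4 0.5 0.6\n")

sentences = list(read_conll(sample_conll))
dim, vectors = read_vec(sample_vec)
```

With real files, the same functions could be called on `open(path, encoding="utf-8")` handles; the `limit` argument is a convenience for loading only the first few embeddings while experimenting.
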
- Our implementations of the machine learning models:
  - Logistic Regression (LR) can be found in LR-notebook.ipynb
  - Multi-Layer Perceptron (MLP) can be found in MLP.ipynb
  - Hidden Markov Model (HMM) can be found in HMM.ipynb
  - An ensemble of the above models can be found in ensemble_model.ipynb
- The src folder contains our common utilities and data-processing functions
- linguisticlions.test.txt contains the test data labeled by our best (final) model. Test data labeled by the other models can be found in Labelled_outputs
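
As a rough illustration of how per-token predictions from several taggers can be combined, the sketch below uses simple majority voting. This is an assumption made for illustration only: the combination strategy actually used in ensemble_model.ipynb is not described here and may differ. The function name `majority_vote` and the sample tag sequences are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine tag sequences (one per model, equal length) by per-token vote.

    Ties are broken in favor of the earliest model in the list.
    """
    voted = []
    for tags in zip(*predictions):
        counts = Counter(tags)
        top = max(counts.values())
        # Pick the first model's tag among those with the highest count.
        voted.append(next(tag for tag in tags if counts[tag] == top))
    return voted


# Made-up predictions for a three-token sentence (illustrative only).
lr_tags = ["B-NP", "I-NP", "B-VP"]
mlp_tags = ["B-NP", "B-NP", "B-VP"]
hmm_tags = ["B-NP", "I-NP", "O"]

combined = majority_vote([lr_tags, mlp_tags, hmm_tags])
```

Listing the strongest model first makes the tie-break favor its prediction, which is one simple way to order the inputs.
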