This repository contains a portfolio of NLP projects that I completed for academic, self-learning, and hobby purposes. The projects are presented as IPython notebooks.
2-way polarity (positive, negative) classification system for reviews, using NLTK's sentiment analysis engine.
Tools: NLTK, scikit-learn
Used logistic regression, naive Bayes, and SVM classifiers with bag-of-words features and polarity lexicons (both built-in and external). A custom user-defined function (UDF) cleans the raw review text.
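As a rough illustration of the cleaning UDF and the bag-of-words features, here is a minimal sketch using only the Python standard library. The names `clean_text`, `bag_of_words`, and the tiny `STOP_WORDS` set are hypothetical and are not taken from the notebooks, which may rely on NLTK's stop-word list and scikit-learn's vectorizers instead.

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list for illustration only; the notebooks may use NLTK's list.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "was", "i"}


def clean_text(raw: str) -> List[str]:
    """Strip HTML tags and non-letters, lower-case, tokenize, drop stop words."""
    text = re.sub(r"<[^>]+>", " ", raw)      # remove HTML tags such as <br>
    text = re.sub(r"[^a-zA-Z]+", " ", text)  # replace anything that is not a letter
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens: List[str]) -> Counter:
    """Count how often each token occurs (word order is discarded)."""
    return Counter(tokens)


review = "This phone is GREAT!<br>Great battery, great price."
tokens = clean_text(review)
features = bag_of_words(tokens)
print(tokens)
print(features.most_common(1))
```

Counts like these are the kind of numeric features that a logistic regression, naive Bayes, or SVM classifier can then be trained on.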
- Reviews were scraped as free text together with their user ratings
- A user rating of 1,2,3 -> sentiment 0 -> negative sentiment
- A user rating of 4 and 5 -> sentiment 1 -> positive sentiment
- Visualizing the text data
  - Word cloud
  - Bar graph
  - Frequency graph
- Text cleaning tasks
- Extract features from text and convert text to numbers
- n-gram analysis: bigrams and trigrams, with visuals of the n-grams
- Sentiment analysis using AFINN and VADER
- Document classification
- Document clustering
- Document and word similarity
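
Two of the steps listed above, the rating-to-sentiment labeling and the n-gram analysis, can be sketched in plain Python as follows. The function names `rating_to_sentiment` and `ngrams` and the sample sentence are assumptions made for illustration; the notebooks may implement these steps differently, for example with NLTK utilities.

```python
from collections import Counter
from typing import List, Tuple


def rating_to_sentiment(rating: int) -> int:
    """Map a 1-5 user rating to a label: 1, 2, 3 -> 0 (negative); 4, 5 -> 1 (positive)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be between 1 and 5, got {rating}")
    return 1 if rating >= 4 else 0


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up sample text, not taken from the scraped reviews.
tokens = "the battery life is good and the battery life is long".split()
bigram_counts = Counter(ngrams(tokens, 2))
trigram_counts = Counter(ngrams(tokens, 3))

labels = [rating_to_sentiment(r) for r in [1, 3, 4, 5]]
print(labels)
print(bigram_counts.most_common(3))
```

The resulting counters can be passed directly to a plotting library to produce the n-gram frequency visuals.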