Sentiment analysis is performed on the collected Twitter data.
For Analysis using NLP and ML:
Check the distribution of tweets in the data. Note that both train and test data are available. Define a
CountVectorizer function to convert the text into a representation of the corresponding word frequencies.
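As a rough illustration of the counting step, the sketch below builds a word-count table using only the Python standard library. It is a stand-in for what a vectorizer such as scikit-learn's CountVectorizer produces, not the project's actual code; the sample tweets and the helper names tokenize and count_matrix are assumptions made up for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens of 2+ characters."""
    return re.findall(r"[a-z]{2,}", text.lower())

def count_matrix(docs):
    """Return (vocabulary, rows): rows[i][j] is the count of vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[w] for w in vocabulary] for c in counts]
    return vocabulary, rows

# Hypothetical sample tweets, for illustration only.
tweets = ["I love this phone", "I hate this phone", "love love love it"]
vocabulary, rows = count_matrix(tweets)

# Summing each column gives the corpus-wide frequency of every word.
totals = {w: sum(r[j] for r in rows) for j, w in enumerate(vocabulary)}
```

Each row is one tweet and each column is one vocabulary word. The column totals are the kind of frequencies that could feed a frequency plot or a word cloud.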
A word cloud is then generated, showing the most frequent words in larger, distinct letters and in different colors.
Then, we collect hashtags by extracting them from certain types of tweets.
We label the tweets collected and remove unwanted patterns from them.
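The hashtag extraction and pattern removal could look roughly like the sketch below. Which patterns count as "unwanted" is not stated here, so the choice of user handles, links, and non-letter characters is an assumption, as are the function names extract_hashtags and clean_tweet and the sample tweet.

```python
import re

def extract_hashtags(tweet):
    """Return the hashtags in a tweet, without the leading '#'."""
    return re.findall(r"#(\w+)", tweet)

def clean_tweet(tweet):
    """Remove handles, links, and non-letter characters, then normalise spacing and case."""
    tweet = re.sub(r"@\w+", " ", tweet)          # user handles (assumed unwanted)
    tweet = re.sub(r"https?://\S+", " ", tweet)  # links (assumed unwanted)
    tweet = re.sub(r"[^a-zA-Z#]", " ", tweet)    # digits, punctuation, emoji
    return " ".join(tweet.split()).lower()

# Hypothetical sample tweet, for illustration only.
sample = "@user Loving the new update!!! #happy #Monday https://example.com/x1"
hashtags = extract_hashtags(sample)
cleaned = clean_tweet(sample)
```

Hashtags are extracted from the raw tweet before cleaning so that their original casing is preserved.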
Stemming is applied to reduce each word to its stem, and the stems are joined back into the tweet text.
A bag-of-words model is then created to extract the key features.
After this, we apply our ML models. To do so, we first split the training data into train and validation sets so that we can evaluate the performance of each model we apply.
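A minimal sketch of such a split, written with the standard library only, is shown below. Libraries such as scikit-learn provide an equivalent helper; the 25% validation fraction, the fixed seed, the function name train_valid_split, and the toy data are all assumptions for illustration.

```python
import random

def train_valid_split(rows, labels, valid_fraction=0.25, seed=42):
    """Shuffle indices reproducibly and split rows/labels into train and validation parts."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_valid = int(len(indices) * valid_fraction)
    valid_idx, train_idx = indices[:n_valid], indices[n_valid:]
    x_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_valid = [rows[i] for i in valid_idx]
    y_valid = [labels[i] for i in valid_idx]
    return x_train, x_valid, y_train, y_valid

# Hypothetical toy data: eight feature rows with binary labels.
rows = [[i, i % 3] for i in range(8)]
labels = [i % 2 for i in range(8)]
x_train, x_valid, y_train, y_valid = train_valid_split(rows, labels)
```

Shuffling the indices rather than the rows keeps every feature row paired with its label, and the held-out validation part is never seen during training.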
Four ML models are then applied, namely Random Forest, Decision Tree, Logistic Regression, and XGBoost.
A confusion matrix is obtained and we check the accuracy associated with each model.
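To make the evaluation step concrete, the sketch below computes a confusion matrix and the accuracy derived from it for one set of predictions. The labels and predictions are invented for the example and are not results from this project; the function names are likewise assumptions.

```python
def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Return matrix[i][j]: samples with true label labels[i] predicted as labels[j]."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0

# Hypothetical validation labels and predictions from one model.
y_valid = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 0]
cm = confusion_matrix(y_valid, y_pred)
acc = accuracy(cm)
```

Rows correspond to true labels and columns to predicted labels, so the off-diagonal cells show which kind of mistake a model makes. Repeating this for each of the four models gives a like-for-like comparison.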
For GUI-based Sentiment Analyser:
To develop the web application, we used the “Flask” package, which is a web
application framework for Python.
Run on Ubuntu.
Naive Bayes has been used to classify each input sentence as having positive or negative sentiment.
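For readers unfamiliar with the classifier, here is a small self-contained sketch of a multinomial Naive Bayes sentence classifier with add-one smoothing. It is not the application's code: the class name NaiveBayesSentiment, the tokenisation rule, and the four training sentences are assumptions chosen only to show the idea.

```python
import math
import re
from collections import Counter, defaultdict

class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of sentences
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(self._tokens(sentence))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        tokens = self._tokens(sentence)
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for w in tokens:
                if w in self.vocabulary:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

    @staticmethod
    def _tokens(text):
        return re.findall(r"[a-z']+", text.lower())

# Hypothetical training sentences, for illustration only.
train = [
    ("I love this movie", "positive"),
    ("what a great day", "positive"),
    ("this is awful", "negative"),
    ("I hate waiting", "negative"),
]
model = NaiveBayesSentiment().fit(*zip(*train))
```

Calling model.predict("I love this great day") returns "positive" with this toy data. In a web application, a request handler would pass the submitted sentence to predict and render the returned label.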