winkjs / wink-sentiment

Accurate and fast sentiment scoring of phrases with #hashtags, emoticons :) & emojis 🎉

Home Page: http://winkjs.org/wink-sentiment

License: MIT License

JavaScript 100.00%

sentiment-analysis sentiment sentiment-classification nlp sentiment-scores wink emoji emoticons hashtag

wink-sentiment

Accurate & fast sentiment scoring of phrases with #hashtags, emoticons:) & emojis🎉

Analyze the sentiment of tweets, product reviews, social media content, or any other text using wink-sentiment. It is based on AFINN and Emoji Sentiment Ranking; its features include:

Intelligent negation handling; for example, the phrase "good product" gets a positive score, whereas "not a good product" gets a negative score.
Automatic detection and scoring of two-word phrases in a text; for example, "cool stuff", "well done", and "short sighted".
Processes each emoji, emoticon and/or hashtag separately while scoring.
Embeds a powerful tokenizer that returns the tokenized phrase.
Returns the sentiment score and tokens. Each token contains a set of properties defining its sentiment, if any.
Achieves an accuracy of 77% when validated against the Amazon Product Review Sentiment Labelled Sentences Data Set from the UCI Machine Learning Repository.

Installation

Use npm to install:

npm install wink-sentiment --save

Getting Started

// Load wink-sentiment package.
var sentiment = require( 'wink-sentiment' );
// Just give any phrase and check out the sentiment score. A positive score
// means a positive sentiment, whereas a negative score indicates a negative
// sentiment. Neutral sentiment is signalled by a near zero score.

// Positive sentiment text.
console.log( sentiment( 'Excited to be part of the @imascientist team:-)!' ) );
// -> { score: 5,
//      normalizedScore: 2.5,
//      tokenizedPhrase: [
//        { value: 'Excited', tag: 'word', score: 3 },
//        { value: 'to', tag: 'word' },
//        { value: 'be', tag: 'word' },
//        { value: 'part', tag: 'word' },
//        { value: 'of', tag: 'word' },
//        { value: 'the', tag: 'word' },
//        { value: '@imascientist', tag: 'mention' },
//        { value: 'team', tag: 'word' },
//        { value: ':-)', tag: 'emoticon', score: 2 },
//        { value: '!', tag: 'punctuation' }
//      ]
//    }

// Negative sentiment text.
console.log( sentiment( 'Not a good product :(' ) );
// -> { score: -5,
//      normalizedScore: -2.5,
//      tokenizedPhrase: [
//        { value: 'Not', tag: 'word' },
//        { value: 'a', tag: 'word', negation: true },
//        { value: 'good', tag: 'word', negation: true, score: -3 },
//        { value: 'product', tag: 'word' },
//        { value: ':(', tag: 'emoticon', score: -2 }
//      ]
//    }

// Neutral sentiment text.
console.log( sentiment( 'I will meet you tomorrow.' ) );
// -> { score: 0,
//      normalizedScore: 0,
//      tokenizedPhrase: [
//        { value: 'I', tag: 'word' },
//        { value: 'will', tag: 'word' },
//        { value: 'meet', tag: 'word' },
//        { value: 'you', tag: 'word' },
//        { value: 'tomorrow', tag: 'word' },
//        { value: '.', tag: 'punctuation' }
//      ]
//    }

Try experimenting with this example and more on Runkit in the browser.
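
Because the result is a plain object, its per-token details can be inspected directly. The following is a minimal sketch and not part of the wink-sentiment API: the helper names `scoredTokens` and `negatedTokens` are hypothetical, and `sample` is the result shown above for 'Not a good product :(' copied in as a literal so that the snippet runs without the package installed.

```javascript
// Literal copy of the result shown above for 'Not a good product :('.
const sample = {
  score: -5,
  normalizedScore: -2.5,
  tokenizedPhrase: [
    { value: 'Not', tag: 'word' },
    { value: 'a', tag: 'word', negation: true },
    { value: 'good', tag: 'word', negation: true, score: -3 },
    { value: 'product', tag: 'word' },
    { value: ':(', tag: 'emoticon', score: -2 }
  ]
};

// Hypothetical helper: keep only the tokens that carry a sentiment score.
function scoredTokens( result ) {
  return result.tokenizedPhrase.filter( ( t ) => typeof t.score === 'number' );
}

// Hypothetical helper: keep only the tokens flagged as being under negation.
function negatedTokens( result ) {
  return result.tokenizedPhrase.filter( ( t ) => t.negation === true );
}

console.log( scoredTokens( sample ).map( ( t ) => `${t.value}: ${t.score}` ) );
// -> [ 'good: -3', ':(: -2' ]
console.log( negatedTokens( sample ).map( ( t ) => t.value ) );
// -> [ 'a', 'good' ]
```

In this sample, the scores of the individual tokens add up to the overall score (-3 + -2 = -5).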

Documentation

Check out the wink sentiment API documentation to learn more.

Need Help?

If you spot a bug that has not yet been reported, raise a new issue or consider fixing it and sending a pull request.

About wink

Wink is a family of open source packages for Statistical Analysis, Natural Language Processing and Machine Learning in Node.js. The code is thoroughly documented for easy human comprehension and has ~100% test coverage, giving it the reliability needed to build production-grade solutions.

Copyright & License

wink-sentiment is licensed under the terms of the MIT License.


wink-sentiment's Issues

change normalized score computation method

use the average of the words that have a sentiment score associated with them;
normalize this average with the average sentence length (assume 15 words);
(hashtag score + this normalized score)/2 will be the final normalized score.

update copyright notice year in README & License Header

Add n't to negation

Refer to comments in #12.

try to determine hashtag's sentiment

A hashtag is automatically identified by wink-tokenizer; remove the # to extract the remaining word and look up its score. This alone should handle simple hashtags such as #win or #fail.

We can even try a regex to extract multiple words and look up each extracted word separately; for example, #FailedProduct can be split using the regex /[A-Z]+[a-z]*|[0-9]+/g. On a match, this yields [ 'Failed', 'Product' ], which results in a score of -2. If there is more than one sentiment word, the average over the sentiment-bearing words only should be taken as the score.

Since a hashtag acts more like the category of the text, the final score from the hashtags must be added to the final sentiment score of the text, and the total divided by 2, to ensure the hashtag gets its due credit!
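
One way the idea above might be sketched is shown below. This is only an illustration under stated assumptions, not the library's implementation: `lexicon` is a tiny hypothetical stand-in for the AFINN lookup with illustrative values, and `hashtagScore` and `combineScores` are hypothetical names. Trying the whole word first, before splitting, is an added assumption so that all-lowercase hashtags such as #win, which the regex does not match, are still scored.

```javascript
// Hypothetical stand-in for the AFINN lexicon; the values are illustrative only.
const lexicon = new Map( [
  [ 'fail', -2 ],
  [ 'failed', -2 ],
  [ 'win', 4 ],
  [ 'great', 3 ]
] );

// Regex quoted in the issue: a run of capitals followed by optional
// lowercase letters, or a run of digits.
const splitter = /[A-Z]+[a-z]*|[0-9]+/g;

// Score one hashtag: try the whole word first, then the split words.
function hashtagScore( hashtag ) {
  const body = hashtag.replace( /^#/, '' );
  const whole = lexicon.get( body.toLowerCase() );
  if ( whole !== undefined ) return whole;
  // match() returns null when nothing matches, hence the fallback to [].
  const parts = body.match( splitter ) || [];
  const scores = parts
    .map( ( word ) => lexicon.get( word.toLowerCase() ) )
    .filter( ( score ) => score !== undefined );
  if ( scores.length === 0 ) return 0;
  // Average over the sentiment-bearing words only.
  return scores.reduce( ( sum, score ) => sum + score, 0 ) / scores.length;
}

// Give the hashtag equal weight with the score of the rest of the text.
function combineScores( textScore, tagScore ) {
  return ( textScore + tagScore ) / 2;
}

console.log( hashtagScore( '#FailedProduct' ) ); // -> -2
console.log( hashtagScore( '#win' ) );           // -> 4
console.log( combineScores( -3, hashtagScore( '#FailedProduct' ) ) ); // -> -2.5
```

With this sketch, a hashtag that contains no lexicon words scores 0 and therefore halves the text score when combined; whether that is desirable is a separate design question.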

change "homepage" in package.json to winkjs.org

update wink-tokenizer dependency

Support for multiple languages

Do you have any plans to support multiple languages?

update dependencies

The tokenizer has moved to 3.2.0 from 3.0.0.

upgrade wink-tokenizer -> 2.3.1

update dev dependencies

compute sentiment scores of phrases properly

tokenizing working incorrectly in webpack/create-react-app production build

Hi! Thanks so much for creating this library! I was trying it out by creating a very simple React application that will do sentiment analysis for a given text input as the input changes.

The app is here: https://react-sentiment-analyzer.netlify.com/. Although it works pretty well on localhost in development, when I create a production build using the webpack setup in Create-React-App, wink-sentiment appears to tokenize the input string differently than it does in development. Essentially, it seems to split words into smaller pieces than it does in development, which makes me think that I may be importing something incorrectly.

For example, here's the local development version which shows the wink-sentiment output for the word "angry" via Redux-devtools on the right:

Note that it's parsing it all as a single word.

As a contrast, this is the Redux devtools output for the same word on https://react-sentiment-analyzer.netlify.com/ in a production build:

Here it's parsing the word "angry" into two tokens: a and ngry, and thereby not outputting the same sentiment.

The code I'm using to run wink-sentiment is https://github.com/kellyi/react-sentiment-analyzer/blob/master/src/utils/sentimentAnalyzer.js ->

import sentiment from 'wink-sentiment';

const sentimentAnalyzer = text => sentiment(text);

export default sentimentAnalyzer;

Do you have any idea why wink-sentiment might be tokenizing things one way in development and another in production?

Thanks again for creating this library!

include fix details js doc

refer to pr #3