AWESSOME

A Word Embedding Sentiment Scorer Of Many Emotions (AWESSOME) is a framework for predicting the sentiment intensity of words and sentences (e.g., phrases, tweets).

AWESSOME relies on sentiment seed words and word embeddings: the similarity between the vector representations of two sentences is taken as a reflection of their sentiment similarity.
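The idea can be illustrated with a minimal sketch. It assumes that a text is scored as its average cosine similarity to the positive seeds minus its average cosine similarity to the negative seeds; that combination rule is an assumption made for illustration, not something this README states. The 3-dimensional vectors are hypothetical toy values rather than real model output, and the function names are not part of the AWESSOME API.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(vector, seed_vectors):
    """Average similarity between one vector and a list of seed vectors."""
    return sum(cosine_similarity(vector, s) for s in seed_vectors) / len(seed_vectors)


def sentiment_score(vector, positive_seeds, negative_seeds):
    """Assumed rule: closeness to positive seeds minus closeness to negative seeds."""
    return mean_similarity(vector, positive_seeds) - mean_similarity(vector, negative_seeds)


# Hypothetical toy embeddings (a real setup would obtain these from a language model).
positive_seeds = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
negative_seeds = [[0.0, 0.1, 0.9], [0.1, 0.2, 0.8]]

happy_sentence = [0.7, 0.3, 0.1]  # lies near the positive seeds
sad_sentence = [0.1, 0.3, 0.7]    # lies near the negative seeds

print(sentiment_score(happy_sentence, positive_seeds, negative_seeds))  # greater than 0
print(sentiment_score(sad_sentence, positive_seeds, negative_seeds))    # less than 0
```

A vector close to the positive seeds receives a score above zero, and one close to the negative seeds a score below zero; the magnitude serves as the intensity.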

AWESSOME capitalizes on pre-existing lexicons ([VADER](https://github.com/cjhutto/vaderSentiment), [LabMT](https://trinker.github.io/qdapDictionaries/labMT.html)), but custom lexicons can also be used, or created with AWESSOME.

AWESSOME also draws upon recent advances in language models by using the Transformers library from HuggingFace to create word embeddings with BERT, RoBERTa, etc.

AWESSOME is scalable and does not require any training data, while providing more fine-grained (and accurate) sentiment intensity scores for words, phrases and text.

Citation Information

If you use the AWESSOME sentiment analysis tools in your research, please cite the following paper. For example (to be added):

Amal Htait & Leif Azzopardi. AWESSOME: An Unsupervised Sentiment Intensity Scoring Framework Using Neural Word Embeddings. In ECIR 2021.

Installation

To install AWESSOME:

  1. The simplest way is to use the command line to install from [PyPI] using pip, e.g.,
    > pip install awessome
  2. If you already have AWESSOME and simply need to upgrade to the latest version, e.g.,
    > pip install --upgrade awessome
  3. You could also clone this [GitHub repository].
  4. You could download and unzip the [full master branch zip file].

In addition to the AWESSOME Python module, you will also be downloading two lexicon dictionaries ([VADER](https://github.com/cjhutto/vaderSentiment), [LabMT](https://trinker.github.io/qdapDictionaries/labMT.html)).

Python Demo and Code Examples

The AWESSOME framework can be flexibly adapted to different seed lexicons and different neural word embedding models in order to produce corpus-specific lexicons, without the need for extensive supervised learning and retraining.

Through its parameters, AWESSOME lets you:

  1. Choose between different available pre-trained language models, such as BERT (bert-base-nli-mean-tokens) and DistilBERT (distilbert-base-nli-stsb-mean-tokens). Note: some pre-trained language models may need a GPU.
  2. Employ different aggregation methods on the similarity scores between the sentence and each term in the seed lists: Average (avg), Maximum (max) and Sum (sum).
  3. Select one of two possible similarity measures, provided by [scipy](https://www.scipy.org/): cosine and euclidean. Note: if no similarity measure is provided, cosine is applied by default.
  4. Select a source for the positive and negative seed lists: the user can provide a new lexicon file, or use one of the pre-built lexicons, vader or labmt (built from the VADER and LabMT sentiment lexicons). Note: if no lexicon file is provided, vader is used as the default source of seed lists.
  5. Choose the size of the seed lists, which are created from the lexicon files. Note: if no size is provided, a default size of 500 is used.
  6. Apply a "Weighted Similarity" to the seeds, by multiplying each similarity score by the sentiment score of the corresponding seed. Users enable or disable this feature by setting "weighted" to True or False. Note: if no value is provided, weighted defaults to False.
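As a rough illustration of how these options interact, the sketch below applies them to one seed list using only the Python standard library. It is not the AWESSOME API: `select_seeds`, `aggregate_similarities`, the toy lexicon and the 2-dimensional embeddings are all hypothetical. Three further assumptions are made for the sake of the example: lexicon scores are centred on zero, the euclidean distance is turned into a similarity as 1 / (1 + distance), and the weight of a seed is the magnitude of its sentiment score.

```python
import math


def cosine_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def euclidean_sim(u, v):
    """Assumed conversion of distance to similarity: closer vectors score higher."""
    return 1.0 / (1.0 + math.dist(u, v))


SIMILARITIES = {"cosine": cosine_sim, "euclidean": euclidean_sim}
AGGREGATORS = {"avg": lambda xs: sum(xs) / len(xs), "max": max, "sum": sum}


def select_seeds(lexicon, size=500):
    """Return the `size` most positive and `size` most negative (word, score) pairs."""
    ranked = sorted(lexicon.items(), key=lambda item: item[1], reverse=True)
    positives = [item for item in ranked if item[1] > 0][:size]
    negatives = [item for item in reversed(ranked) if item[1] < 0][:size]
    return positives, negatives


def aggregate_similarities(vector, seeds, embeddings, aggregation,
                           similarity="cosine", weighted=False):
    """Aggregate the similarity of `vector` to every seed in one seed list."""
    sim = SIMILARITIES[similarity]
    scores = []
    for word, sentiment in seeds:
        score = sim(vector, embeddings[word])
        if weighted:
            score *= abs(sentiment)  # assumed weighting by seed intensity
        scores.append(score)
    return AGGREGATORS[aggregation](scores)


# Hypothetical lexicon and embeddings, for illustration only.
lexicon = {"great": 3.0, "good": 2.0, "bad": -2.0, "awful": -3.0}
embeddings = {"great": [1.0, 0.0], "good": [0.6, 0.8],
              "bad": [-0.6, 0.8], "awful": [-1.0, 0.0]}
sentence = [1.0, 0.0]

positives, negatives = select_seeds(lexicon, size=2)
for method in ("avg", "max", "sum"):
    print(method, aggregate_similarities(sentence, positives, embeddings, method))
print("weighted avg",
      aggregate_similarities(sentence, positives, embeddings, "avg", weighted=True))
```

Keeping the measures and aggregation methods in dictionaries keyed by the option names (avg, max, sum, cosine, euclidean) makes each option a one-line lookup, and the defaults mirror those listed above: cosine, a seed list size of 500, and weighted set to False.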

An example demo is provided in awessome_demo.py.
