
scalaconsultants / aspect-based-sentiment-analysis

529 stars · 89 forks · 1.85 MB

💭 Aspect-Based-Sentiment-Analysis: Transformer & Explainable ML (TensorFlow)

License: Apache License 2.0

Python 100.00%
aspect-based-sentiment-analysis bert-embeddings deep-learning distill explainable-ai explainable-ml interpretability machine-learning sentiment-analysis tensorflow transformer-models transformers

aspect-based-sentiment-analysis's Issues

Can't execute the example

Hi!

I tried to execute

import aspect_based_sentiment_analysis as absa

nlp = absa.load()
text = ("We are great fans of Slack, but we wish the subscriptions "
        "were more accessible to small startups.")

slack, price = nlp(text, aspects=['slack', 'price'])
assert price.sentiment == absa.Sentiment.negative
assert slack.sentiment == absa.Sentiment.positive

but I only get

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-0cd7d305f4c1> in <module>
----> 1 import aspect_based_sentiment_analysis as absa
      2 
      3 nlp = absa.load()
      4 text = ("We are great fans of Slack, but we wish the subscriptions "
      5         "were more accessible to small startups.")

~\Miniconda3\lib\site-packages\aspect_based_sentiment_analysis\__init__.py in <module>
      2 __version__ = "2.0.1"
      3 
----> 4 from .alignment import tokenize
      5 from .alignment import make_alignment
      6 from .alignment import merge_tensor

~\Miniconda3\lib\site-packages\aspect_based_sentiment_analysis\alignment.py in <module>
      7 import numpy as np
      8 
----> 9 from .data_types import TokenizedExample
     10 
     11 

~\Miniconda3\lib\site-packages\aspect_based_sentiment_analysis\data_types.py in <module>
    159 
    160 
--> 161 @dataclass(frozen=True)
    162 class InputBatch:
    163     """ The model uses these tensors to perform a prediction.

~\Miniconda3\lib\site-packages\aspect_based_sentiment_analysis\data_types.py in InputBatch()
    170     indicate first and second portions of the inputs, zeros
    171     and ones. """
--> 172     token_ids: tf.Tensor
    173     attention_mask: tf.Tensor
    174     token_type_ids: tf.Tensor

AttributeError: module 'tensorflow' has no attribute 'Tensor'

Do I need an older TensorFlow version?

Thank you!

Best regards
Robert
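
A quick diagnostic sketch (not from the repo): `tf.Tensor` being missing usually means a broken TensorFlow install or a local file shadowing the package, so it is worth confirming which TensorFlow the interpreter actually imports. Other issues in this tracker show the requirement tensorflow>=2.1.

import tensorflow as tf

print(tf.__version__)          # the package expects tensorflow>=2.1
print(tf.__file__)             # should point into site-packages, not a local tensorflow.py
print(hasattr(tf, "Tensor"))   # True on a healthy install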

Can't install the package on Ubuntu 18.04 with Python 3.7.5

I am doing the following:

sudo pip3 install aspect-based-sentiment-analysis

Failing with tensorflow:

Collecting tensorflow>=2.1 (from aspect-based-sentiment-analysis)
Could not find a version that satisfies the requirement tensorflow>=2.1

Please advise how I can install it on Ubuntu 18.04 with Python 3.7.5.

I can do it smoothly on Windows 10, but I need it on my work laptop.
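
For what it's worth, a hedged diagnostic: "Could not find a version" for TensorFlow on Linux most often means a 32-bit interpreter or an outdated pip, since TF 2.x wheels exist only for 64-bit CPython. A quick check:

import platform
import struct
import sys

print(sys.version)               # expect CPython 3.7.x
print(struct.calcsize("P") * 8)  # must print 64; there are no 32-bit TF 2.x wheels
print(platform.machine())        # expect x86_64

If these look fine, upgrading pip itself (python3 -m pip install --upgrade pip) is the usual next step.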

Inference on large data

Hi,

I am trying to use the pretrained 'absa/classifier-rest-0.2' model for generating labels for approximately 300K reviews. In the README examples, I could not find a way to provide batches of data (batches or list of reviews) to the model. Feeding batches would allow me to potentially speed up the inference process.

I am merely looking to generate labels from your model as input for a different analysis. There is no intention to train, as my task is exactly what your model has been trained on.

Any other ideas on how to speed up the inference process would also be welcome!
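
A rough sketch of one approach (the attribute names on the completed task match those shown in other issues here; the batching granularity is an assumption): load the pipeline once and reuse it, grouping all aspects of a review into a single call so they are processed as one internal batch.

import aspect_based_sentiment_analysis as absa

nlp = absa.load('absa/classifier-rest-0.2')   # load the model once, not per review

def label_reviews(reviews, aspects):
    # Each call internally batches all (sentence, aspect) examples of one
    # review, so the remaining win is simply reusing the loaded pipeline.
    for text in reviews:
        completed = nlp(text, aspects=aspects)
        yield [(ex.aspect, ex.sentiment) for ex in completed.examples]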

can't install the package on a clean conda env

Here is the error I get when trying to pip install the package. Not too sure what to do to make it work; any suggestions, please?

Thanks

Collecting aspect-based-sentiment-analysis
  ERROR: Could not find a version that satisfies the requirement aspect-based-sentiment-analysis (from versions: none)
ERROR: No matching distribution found for aspect-based-sentiment-analysis

FutureWarning: The `pad_to_max_length` argument is deprecated

Using the aspect-based sentiment analysis model classifier-rest-0.1 results in a FutureWarning from transformers\tokenization_utils_base.py:1773:

The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
  FutureWarning,
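
The warning is raised inside the pinned transformers version rather than in user code, so until the dependency is bumped it can only be silenced. A minimal sketch:

import warnings

warnings.filterwarnings(
    "ignore",
    message="The `pad_to_max_length` argument is deprecated",
    category=FutureWarning,
)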

AttributeError: 'CompletedSubTask' object has no attribute 'aspect_representation'

Hi, I have the following problem: when I run the example you provided, I manage to obtain the slack and price sentiments, but the problem occurs when I try to create the HTML (html = absa.probing.explain(slack)). This command fails with the error:

AttributeError Traceback (most recent call last)
in
----> 1 html = absa.probing.explain(slack)

~/anaconda3/envs/torch_tf2/lib/python3.7/site-packages/aspect_based_sentiment_analysis/probing/plots.py in explain(example)
     47
     48 def explain(example: PredictedExample):
---> 49     aspect = example.aspect_representation
     50     texts = [f'Words connected with the "{example.aspect}" aspect: ']
     51     texts.extend(highlight_sequence(aspect.tokens, aspect.look_at))

AttributeError: 'CompletedSubTask' object has no attribute 'aspect_representation'
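
A hedged guess based on the other snippets in this tracker: examples only seem to carry probing data when the pipeline is loaded with a pattern recognizer, so loading as below (mirroring the repo's pattern examples) may populate the missing attribute:

import aspect_based_sentiment_analysis as absa

recognizer = absa.aux_models.BasicPatternRecognizer()
nlp = absa.load(pattern_recognizer=recognizer)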

Training custom model

I want to use this package for mobile-phone ABSA. Can anyone tell me how to train the model and use the trained model in the nlp.load() method?
Actually, I am new to NLP. Can anyone show the code to do that?
Also, I don't know what I have to do after running the script "train_classifier.py".
The script also shows an error telling me to "upgrade storage by giving the database path" when I run it on Google Colab:
!python /content/Aspect-Based-Sentiment-Analysis/examples/train_classifier.py --domain Smartphones

 Traceback (most recent call last):
  File "/content/Aspect-Based-Sentiment-Analysis/examples/train_classifier.py", line 183, in <module>
    load_if_exists=True)
  File "/usr/local/lib/python3.7/dist-packages/optuna/study.py", line 1055, in create_study
    storage = storages.get_storage(storage)
  File "/usr/local/lib/python3.7/dist-packages/optuna/storages/__init__.py", line 27, in get_storage
    return _CachedStorage(RDBStorage(storage))
  File "/usr/local/lib/python3.7/dist-packages/optuna/storages/_rdb/storage.py", line 173, in __init__
    self._version_manager.check_table_schema_compatibility()
  File "/usr/local/lib/python3.7/dist-packages/optuna/storages/_rdb/storage.py", line 1288, in check_table_schema_compatibility
    raise RuntimeError(message)
RuntimeError: The runtime optuna version 2.5.0 is no longer compatible with the table schema (set up by optuna 2.2.0). Please execute "$ optuna storage upgrade --storage $STORAGE_URL" for upgrading the storage.

The stepwise guide would be awesome.
Thanks for such a good package.
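
The error message itself names the fix (optuna storage upgrade --storage $STORAGE_URL). An alternative sketch, with hypothetical names, that sidesteps the migration by pointing the study at a fresh SQLite file:

import optuna

study = optuna.create_study(
    study_name='absa-smartphones',       # hypothetical study name
    storage='sqlite:///absa_fresh.db',   # new file, so no old schema to upgrade
    load_if_exists=True,
)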

Package import error

I already install the package, but when I import it, I got the following error:
ImportError: cannot import name 'collections_abc' from 'six.moves' (unknown location)

Is there a way to skip this part?

Thank you

Problem load `BERT-ADA` pretrained model

Hi, I want to say the work is really wonderful!

I installed the package and it works! Then I wanted to see if the other pretrained models work, so I downloaded the BERT-ADA pretrained model listed in the README; specifically, I downloaded laptops_and_restaurants_2mio_ep15.

name = "/home/projects/pre_train_models/laptops_and_restaurants_2mio_ep15"
model = absa.BertABSClassifier.from_pretrained(name)

The error I got is:

OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5'] found in directory /home/projects/pre_train_models/laptops_and_restaurants_2mio_ep15 or `from_pt` set to False

Then I put in from_pt=True

name = "/home/projects/pre_train_models/laptops_and_restaurants_2mio_ep15"
model = absa.BertABSClassifier.from_pretrained(name, from_pt=True)

This time I got AttributeError: 'BertConfig' object has no attribute 'num_polarities'.

I am quite new to this area and still experimenting; could you help solve the issue?

Another question: where is the model stored? I couldn't find absa/classifier-lapt-0.2 in the package folder, or anywhere else. I looked at the loads.py file; it is supposed to download to a download folder, but I can't find that either. If I find it, I can compare its config.json file with the other model's.

I have created two conda envs, one with transformers==4.2 and another with transformers==2.5; both have the same error indicated above.

Thank you very much!
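
A hedged attempt, mirroring how the loads.py excerpts elsewhere in this tracker build the model: wrap the BERT-ADA checkpoint in the package's own config class, which should carry the num_polarities field the classifier head expects. Both the package-level export of BertABSCConfig and the 3-class head are assumptions here.

import aspect_based_sentiment_analysis as absa

name = "/home/projects/pre_train_models/laptops_and_restaurants_2mio_ep15"
config = absa.BertABSCConfig.from_pretrained(name, num_polarities=3)  # assumed 3-way head
model = absa.BertABSClassifier.from_pretrained(name, config=config, from_pt=True)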

Strength

Does this model have any ability to support more than a binary classification? Our use case benefits a lot from having a magnitude, something like returning a float in the range -1 to 1, where 1 is as positive as possible, 0 is neutral, and -1 is negative.
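
Not built in, but derivable from what the pipeline already exposes: example.scores is ordered (neutral, negative, positive) per the outputs quoted in other issues here, so a signed magnitude can be sketched as:

def polarity(example):
    neutral, negative, positive = example.scores
    return positive - negative   # ~1.0 very positive, ~0 neutral, ~-1.0 very negative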

Can't install the package in anaconda!

I want to run it in JupyterLab under conda. I also followed the steps to clone the code and create a new conda env, but it didn't work. Could somebody please walk me through the steps to resolve the issue?

Extract Weights and scores from result

import numpy as np
import aspect_based_sentiment_analysis as absa
from aspect_based_sentiment_analysis import alignment
from aspect_based_sentiment_analysis import Example

text = "I love mascara"
aspects = ['mascara']

recognizer = absa.aux_models.BasicPatternRecognizer()
nlp = absa.load(pattern_recognizer=recognizer)
task = nlp(text=text, aspects=aspects)
slack = task.examples

print(slack)

[PredictedExample(text='I love mascara', aspect='mascara', sentiment=<Sentiment.positive: 2>, text_tokens=['i', 'love', 'mascara'], text_subtokens=['i', 'love', 'mascara'], aspect_tokens=['mascara'], aspect_subtokens=['mascara'], tokens=['[CLS]', 'i', 'love', 'mascara', '[SEP]', 'mascara', '[SEP]'], subtokens=['[CLS]', 'i', 'love', 'mascara', '[SEP]', 'mascara', '[SEP]'], alignment=[[0], [1], [2], [3], [4], [5], [6]], scores=[0.0005469007, 0.0009526035, 0.99850047], review=Review(is_reference=None, patterns=[Pattern(importance=1.0, tokens=['i', 'love', 'mascara'], weights=[0.28, 1.0, 0.71]), Pattern(importance=0.58, tokens=['i', 'love', 'mascara'], weights=[0.13, 0.58, 0.58]), Pattern(importance=0.25, tokens=['i', 'love', 'mascara'], weights=[0.25, 0.25, 0.17])]))]

print("###########")
print("Aspect :",slack.aspect)
print("Sentiment :",slack.sentiment)
print("Scores (neutral/negative/positive): ",slack.scores)
#print("Tokens :",slack.text_subtokens)
#print("Words weights related to the aspect :",slack.review.patterns[0].weights)
word = []
list_numbers = slack.review.patterns[0].weights
g = [i for i, n in enumerate(list_numbers) if n > 0.5]  # list comprehension
for i in range(0, len(g)):
    word_indx = g[i]
    word.append(slack.text_subtokens[word_indx])
print(word)

###########
Result

###########
Aspect : price
Sentiment : Sentiment.positive
Scores (neutral/negative/positive): [0.0005469007, 0.0009526035, 0.99850047]
['love', 'mascara']

#########
Problem
Now, each time I run it again, I receive:


AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
      1 print("###########")
----> 2 print("Aspect :", slack.aspect)
      3 print("Sentiment :", slack.sentiment)
      4 print("Scores (neutral/negative/positive): ", slack.scores)
      5 print("Tokens :", slack.text_subtokens)

AttributeError: 'list' object has no attribute 'aspect'


AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
----> 1 absa.summary(skin)
      2 absa.display(skin.scores)

~\Anaconda3\envs\asba_aymen_setup\lib\site-packages\aspect_based_sentiment_analysis\plots.py in summary(example)
     64
     65 def summary(example: PredictedExample):
---> 66     print(f'{str(example.sentiment)} for "{example.aspect}"')
     67     rounded_scores = np.round(example.scores, decimals=3)
     68     print(f'Scores (neutral/negative/positive): {rounded_scores}')

AttributeError: 'list' object has no attribute 'sentiment'

Can someone help me out please :)
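
The likely culprit (hedged): task.examples is a list, and on the later runs slack was bound to the whole list rather than to one element. Indexing or unpacking it first restores the per-example attributes:

slack = task.examples[0]   # or: slack, = task.examples
print("Aspect :", slack.aspect)
print("Sentiment :", slack.sentiment)
print("Scores (neutral/negative/positive): ", slack.scores)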

a problem on training with optuna

Hello! I'm on Windows 10 and can't install optuna. Could you provide some sample training sessions that are not optimized with optuna? Thank you!
[screenshot attached]

Error in Usage

Hi,

I cloned the repo and was trying to import the module locally. I've installed transformers as well. I'm getting this error.

Python: 3.6
[screenshot of the error attached]

Multiple tokens

Hi, thanks for this great library! Everything works great but I have 2 questions:

  1. Is there any way to accommodate aspects which have more than one word? E.g. task = nlp(sentence, aspects=['social media']) raises an error:

    149             raise ValueError
    150         if len(example.aspect_tokens) > 1:
--> 151             raise ValueError
    152
    153     @staticmethod

ValueError:

  2. Are there any other properties of task.batch, such as the overall sentiment score or the words that constitute that sentiment score?

Thanks so much!

Questions about BasicPatternRecognizer

Hi,

Firstly, just want to say what a wonderful resource this is! I have several questions about the BasicPatternRecognizer:

  1. References - Of the references you have provided on how attention values can explain a model decision in simple terms [1, 2, 3, 4, 5], none seem to mention using attention gradients. If possible, could you either provide the references that informed your thinking on this or give some intuition for why your method works? For example, in BasicPatternRecognizer you use x = tf.reduce_sum(x, axis=[0, 1], keepdims=True) to combine the attention_scores * gradients for all heads and layers. I think I understand why this works, but I've never seen it done before.
  2. Patterns - I am unclear on what some of the code used to construct the patterns is doing. In particular, I don't understand the line w = x[0, text_mask], specifically what the 0 is doing; why do we care about the first row, and why do we use it to calculate the importance of a given pattern? (A toy sketch of both operations follows this list.)
  3. Combine patterns into one - I would like to create a single visualization of which tokens the model treats as important for a given aspect. I have some ideas, like scaling the weights by importance and combining them, but I really need to understand the motivation behind the importance metric first. Do you have any thoughts or resources I could look at to achieve this? (I would also like to be able to extract the tokens/words which are most important for deciding the sentiment of a given aspect.)
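
A toy sketch of the two operations in question (shapes and the meaning of row 0 are inferences from the snippet, not taken from the library): summing over the first two axes collapses all layers and heads into one seq x seq map, and row 0 of that map plausibly holds the aggregated attention from the [CLS] token to every other token, a common proxy for whole-sequence importance.

import numpy as np

layers, heads, seq = 12, 12, 7
att_grad = np.random.rand(layers, heads, seq, seq)   # attention * gradient maps

x = att_grad.sum(axis=(0, 1))                        # combine layers & heads -> [seq, seq]
text_mask = np.array([0, 1, 1, 1, 0, 0, 0], bool)    # keep only plain text tokens
w = x[0, text_mask]                                  # row 0: [CLS]'s attention to text tokens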

Thank you so much!

Josh

Cannot install via pip

I tried to install the package via pip, which returns the following error.

% pip install aspect-based-sentiment-analysis
ERROR: Could not find a version that satisfies the requirement aspect-based-sentiment-analysis (from versions: none)
ERROR: No matching distribution found for aspect-based-sentiment-analysis

Any ideas?

ValueError: The first argument to `Layer.call` must always be passed.

I installed your module and ran the quick start

import aspect_based_sentiment_analysis as absa
recognizer = absa.aux_models.BasicPatternRecognizer()
nlp = absa.load(pattern_recognizer=recognizer) 

and the following error occurred.


ValueError                                Traceback (most recent call last)
<ipython-input-356-900f8907a6c9> in <module>
      2 
      3 recognizer = absa.aux_models.BasicPatternRecognizer()
----> 4 nlp = absa.load(pattern_recognizer=recognizer)

~/anaconda3/envs/myenv1/lib/python3.8/site-packages/aspect_based_sentiment_analysis/loads.py in load(name, text_splitter, reference_recognizer, pattern_recognizer, **model_kwargs)
     32     try:
     33         config = BertABSCConfig.from_pretrained(name, **model_kwargs)
---> 34         model = BertABSClassifier.from_pretrained(name, config=config)
     35         tokenizer = transformers.BertTokenizer.from_pretrained(name)
     36         professor = Professor(reference_recognizer, pattern_recognizer)

~/anaconda3/envs/myenv1/lib/python3.8/site-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)

~/anaconda3/envs/myenv1/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010   def trainable(self, value):
   1011     self._trainable = value
-> 1012     for layer in getattr(self, '_layers', []):
   1013       layer.trainable = value
   1014 

~/anaconda3/envs/myenv1/lib/python3.8/site-packages/aspect_based_sentiment_analysis/models.py in call(self, token_ids, attention_mask, token_type_ids, training, **bert_kwargs)
    139             **bert_kwargs
    140     ) -> Tuple[tf.Tensor, Tuple[tf.Tensor, ...], Tuple[tf.Tensor, ...]]:
--> 141         outputs = self.bert(
    142             inputs=token_ids,
    143             attention_mask=attention_mask,

~/anaconda3/envs/myenv1/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
    940             # TODO(fchollet): consider py_func as an alternative, which
    941             # would enable us to run the underlying graph if needed.
--> 942             outputs = self._symbolic_call(inputs)
    943 
    944           if outputs is None:

~/anaconda3/envs/myenv1/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in _split_out_first_arg(self, args, kwargs)

ValueError: The first argument to `Layer.call` must always be passed.

Could you tell me the solution?
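
A hedged first check: this traceback usually signals a TF/transformers combination the package was not written against. Other issues in this tracker mention tensorflow==2.2 and transformers==2.5 as the pinned pair, so comparing what is installed is a cheap first step:

import tensorflow as tf
import transformers

print(tf.__version__)            # pinned pair reported elsewhere: tensorflow 2.2
print(transformers.__version__)  # ... and transformers 2.5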

NOT WORKING - Local Jupyter and Google Colab!!

I am using the latest Python 3.7.
In Colab it gives an error like: ModuleNotFoundError: No module named 'aspect_based_sentiment_analysis'

In Jupyter it gives an error as below:

ImportError                               Traceback (most recent call last)
~\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py in <module>
     57
---> 58 from tensorflow.python.pywrap_tensorflow_internal import *
     59

~\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py in <module>
     27     return _mod
---> 28 _pywrap_tensorflow_internal = swig_import_helper()
     29 del swig_import_helper

~\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py in swig_import_helper()
     23     try:
---> 24         _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
     25     finally:

~\Anaconda3\lib\imp.py in load_module(name, file, filename, details)
    241     else:
--> 242         return load_dynamic(name, filename, file)
    243     elif type_ == PKG_DIRECTORY:

~\Anaconda3\lib\imp.py in load_dynamic(name, path, file)
    341         name=name, loader=loader, origin=path)
--> 342     return _load(spec)
    343

ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input> in <module>
----> 1 import aspect_based_sentiment_analysis as absa
      2
      3 nlp = absa.load()
      4 text = ("We are great fans of Slack, but we wish the subscriptions "
      5         "were more accessible to small startups.")

~\Anaconda3\lib\site-packages\aspect_based_sentiment_analysis\__init__.py in <module>
      2 __version__ = "1.1.2"
      3
----> 4 from .alignment import tokenize
      5 from .alignment import make_alignment
      6 from .alignment import merge_input_attentions

~\Anaconda3\lib\site-packages\aspect_based_sentiment_analysis\alignment.py in <module>
      4 from typing import Tuple
      5
----> 6 import tensorflow as tf
      7 import transformers
      8 import numpy as np

~\Anaconda3\lib\site-packages\tensorflow\__init__.py in <module>
     39 import sys as _sys
     40
---> 41 from tensorflow.python.tools import module_util as _module_util
     42 from tensorflow.python.util.lazy_loader import LazyLoader as _LazyLoader
     43

~\Anaconda3\lib\site-packages\tensorflow\python\__init__.py in <module>
     48 import numpy as np
     49
---> 50 from tensorflow.python import pywrap_tensorflow
     51
     52 # Protocol buffers

~\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py in <module>
     67 for some common reasons and solutions. Include the entire stack trace
     68 above this error message when asking for help.""" % traceback.format_exc()
---> 69 raise ImportError(msg)
     70
     71 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "C:\Users\ddebnath2\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\ddebnath2\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\ddebnath2\Anaconda3\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "C:\Users\ddebnath2\Anaconda3\lib\imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "C:\Users\ddebnath2\Anaconda3\lib\imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.

Update dependencies for usability

Hi all,

This seems like a great library, but it's currently unusable because of the outdated dependencies (tensorflow==2.2, transformers==2.5), which conflict with the versions required by other modern NLP libraries (e.g. Sentence-Transformers) that are often used in the same data pipelines.

I see you've updated the tensorflow dependency in the master github version, but have not pushed it to the pip installable version.

Please update the pip-installable library to work with current versions of tensorflow and transformers. This would make your library much more usable and increase adoption by others.

About neutral sentiment

nlp = absa.load()
text = ("We are great fans of Slack, but we wish the subscriptions "
        "were more accessible to small startups.")
slack, price = nlp(text, aspects=['slack', 'price'])

It seems that the model only classifies the sentiment as positive or negative. Can the model detect neutral sentiment?
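
The classifier appears to be three-way, not binary: the outputs quoted in other issues here show example.scores ordered (neutral, negative, positive). A quick way to see the neutral probability, reusing the snippet above:

slack, price = nlp(text, aspects=['slack', 'price'])
print(price.scores)     # [p_neutral, p_negative, p_positive]
print(price.sentiment)  # can come out neutral when that probability wins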

module 'aspect_based_sentiment_analysis' has no attribute 'BertTokenizer'

name = 'absa/classifier-rest-0.2'
model = absa.BertABSClassifier.from_pretrained(name)
tokenizer = absa.BertTokenizer.from_pretrained(name)
professor = absa.Professor(...) # Explained in detail later on.
text_splitter = absa.sentencizer() # The English CNN model from SpaCy.
nlp = absa.Pipeline(model, tokenizer, professor, text_splitter)

I got the error module 'aspect_based_sentiment_analysis' has no attribute 'BertTokenizer' while running the above code. Any ideas?

Probing is not working

Cloned the repository, installed the dependencies, and ran the patterns.ipynb example.
It fails on the last step (absa.probing.explain) with the error attached:

[screenshot of the error attached]

End to End ABSA

I think this repository is a wonderful resource. However, what seems like a natural extension to the pipeline is prepending an Aspect Term Extraction module, so as to provide an option to perform end-to-end ABSA.

Would love to hear the authors' thoughts on this.

AttributeError with Tokenizer

I'm trying to reproduce the example in the README.

name = 'absa/classifier-rest-0.2'
model = absa.BertABSClassifier.from_pretrained(name)
tokenizer = absa.BertTokenizer.from_pretrained(name)
professor = absa.Professor()     # Explained in detail later on.
text_splitter = absa.sentencizer()  # The English CNN model from SpaCy.
nlp = absa.Pipeline(model, tokenizer, professor, text_splitter)

But I get an AttributeError with the tokenizer.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-9-c6e986c7be44> in <module>
      1 name = 'absa/classifier-rest-0.2'
      2 model = absa.BertABSClassifier.from_pretrained(name)
----> 3 tokenizer = absa.BertTokenizer.from_pretrained(name)
      4 professor = absa.Professor()     # Explained in detail later on.
      5 text_splitter = absa.sentencizer()  # The English CNN model from SpaCy.

AttributeError: module 'aspect_based_sentiment_analysis' has no attribute 'BertTokenizer'

Could you also clarify how the professor works? The article is missing its hyperlink in the README: "In the article [here], we discuss in detail how the model and the professor work".

Thanks in advance.
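
A hedged workaround grounded in the loads.py excerpts quoted elsewhere in this tracker, which fetch the tokenizer from transformers rather than from absa (the README snippet likely predates the current API):

import aspect_based_sentiment_analysis as absa
import transformers

name = 'absa/classifier-rest-0.2'
model = absa.BertABSClassifier.from_pretrained(name)
tokenizer = transformers.BertTokenizer.from_pretrained(name)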

About the AE feature

Can I use this for the aspect extraction task, i.e. as an end-to-end sentiment analysis tool?

Attribute error with spacy text splitter

Hi there!

When I try to follow the pipeline steps laid out in the README exactly, I receive the following error at the preprocessing stage:

AttributeError: 'spacy.tokens.span.Span' object has no attribute 'string'

Upon removing the text_splitter from the pipeline setup I no longer get this error, but it would be useful to be able to initialize the pipeline with the text splitter (e.g. for passing in texts whose tokenization is longer than 512 tokens).

Thank you very much for the help!
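
A hedged stand-in while the pinned spaCy call is outdated: Span.string was removed in spaCy 3 but Span.text survives, so a hand-rolled splitter with the same observable shape (text in, list of sentence strings out; that contract is an assumption) can replace absa.sentencizer():

import spacy
import aspect_based_sentiment_analysis as absa

_spacy_nlp = spacy.load('en_core_web_sm')

def text_splitter(text):
    # Assumed contract, same as absa.sentencizer(): str -> list of sentences.
    return [sent.text for sent in _spacy_nlp(text).sents]

nlp = absa.load(text_splitter=text_splitter)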

value error

@marioosh @lkuczera @molowny @marekklis @jczuchnowski


ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      1 import aspect_based_sentiment_analysis as absa
      2
----> 3 nlp = absa.load()
      4 text = ("We are great fans of Slack, but we wish the subscriptions "
      5         "were more accessible to small startups.")

D:\rj\ana3\lib\site-packages\aspect_based_sentiment_analysis\loads.py in load(name, text_splitter, reference_recognizer, pattern_recognizer, **model_kwargs)
32 try:
33 config = BertABSCConfig.from_pretrained(name, **model_kwargs)
---> 34 model = BertABSClassifier.from_pretrained(name, config=config)
35 tokenizer = transformers.BertTokenizer.from_pretrained(name)
36 professor = Professor(reference_recognizer, pattern_recognizer)

D:\rj\ana3\lib\site-packages\transformers\modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
728 return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
729
--> 730 model(model.dummy_inputs, training=False) # build the network with dummy inputs
731
732 assert os.path.isfile(resolved_archive_file), "Error retrieving file {}".format(resolved_archive_file)

D:\rj\ana3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in call(self, *args, **kwargs)
983
984 with ops.enable_auto_cast_variables(self._compute_dtype_object):
--> 985 outputs = call_fn(inputs, *args, **kwargs)
986
987 if self._activity_regularizer:

D:\rj\ana3\lib\site-packages\aspect_based_sentiment_analysis\models.py in call(self, token_ids, attention_mask, token_type_ids, training, **bert_kwargs)
148 sequence_output, pooled_output, hidden_states, attentions = outputs
149 pooled_output = self.dropout(pooled_output, training=training)
--> 150 logits = self.classifier(pooled_output)
151 return logits, hidden_states, attentions

D:\rj\ana3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in call(self, *args, **kwargs)
980 with ops.name_scope_v2(name_scope):
981 if not self.built:
--> 982 self._maybe_build(inputs)
983
984 with ops.enable_auto_cast_variables(self._compute_dtype_object):

D:\rj\ana3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in _maybe_build(self, inputs)
2616 if not self.built:
2617 input_spec.assert_input_compatibility(
-> 2618 self.input_spec, inputs, self.name)
2619 input_list = nest.flatten(inputs)
2620 if input_list and self._dtype_policy.compute_dtype is None:

D:\rj\ana3\lib\site-packages\tensorflow\python\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
194 ', found ndim=' + str(ndim) +
195 '. Full shape received: ' +
--> 196 str(x.shape.as_list()))
197 # Check dtype.
198 if spec.dtype is not None:

ValueError: Input 0 of layer classifier is incompatible with the layer: : expected min_ndim=2, found ndim=0. Full shape received: []

What should I do about this error? Thanks!

Pre-trained models

I want to start off by saying that I really love your work! Is there any chance that you provide the pre-trained models somewhere?

Cannot install via pip

When I run pip install aspect_based_sentiment_analysis, it gets stuck indefinitely downloading boto3, after saying:

INFO: pip is looking at multiple versions of boto3 to determine which version is compatible with other requirements. This could take a while.

Getting an error when I try to run the first code in the README


ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      1 recognizer = absa.aux_models.BasicPatternRecognizer()
----> 2 nlp = absa.load('absa/classifier-rest-0.2', pattern_recognizer=recognizer)
      3 text = ('We are great fans of Slack, but we wish the subscriptions')
      4 completed_task = nlp(text, aspects=['slack', 'price'])
      5 slack, price = completed_task.examples

~\Anaconda3\envs\ABSA\lib\site-packages\aspect_based_sentiment_analysis\loads.py in load(name, text_splitter, reference_recognizer, pattern_recognizer, **model_kwargs)
32 try:
33 config = BertABSCConfig.from_pretrained(name, **model_kwargs)
---> 34 model = BertABSClassifier.from_pretrained(name, config=config)
35 tokenizer = transformers.BertTokenizer.from_pretrained(name)
36 professor = Professor(reference_recognizer, pattern_recognizer)

~\Anaconda3\envs\ABSA\lib\site-packages\transformers\modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
728 return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
729
--> 730 model(model.dummy_inputs, training=False) # build the network with dummy inputs
731
732 assert os.path.isfile(resolved_archive_file), "Error retrieving file {}".format(resolved_archive_file)

~\Anaconda3\envs\ABSA\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in call(self, *args, **kwargs)
983
984 with ops.enable_auto_cast_variables(self._compute_dtype_object):
--> 985 outputs = call_fn(inputs, *args, **kwargs)
986
987 if self._activity_regularizer:

~\Anaconda3\envs\ABSA\lib\site-packages\aspect_based_sentiment_analysis\models.py in call(self, token_ids, attention_mask, token_type_ids, training, **bert_kwargs)
148 sequence_output, pooled_output, hidden_states, attentions = outputs
149 pooled_output = self.dropout(pooled_output, training=training)
--> 150 logits = self.classifier(pooled_output)
151 return logits, hidden_states, attentions

~\Anaconda3\envs\ABSA\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in call(self, *args, **kwargs)
980 with ops.name_scope_v2(name_scope):
981 if not self.built:
--> 982 self._maybe_build(inputs)
983
984 with ops.enable_auto_cast_variables(self._compute_dtype_object):

~\Anaconda3\envs\ABSA\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in _maybe_build(self, inputs)
2615 # Check input assumptions set before layer building, e.g. input rank.
2616 if not self.built:
-> 2617 input_spec.assert_input_compatibility(
2618 self.input_spec, inputs, self.name)
2619 input_list = nest.flatten(inputs)

~\Anaconda3\envs\ABSA\lib\site-packages\tensorflow\python\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
189 ndim = x.shape.ndims
190 if ndim is not None and ndim < spec.min_ndim:
--> 191 raise ValueError('Input ' + str(input_index) + ' of layer ' +
192 layer_name + ' is incompatible with the layer: '
193 ': expected min_ndim=' + str(spec.min_ndim) +

ValueError: Input 0 of layer classifier is incompatible with the layer: : expected min_ndim=2, found ndim=0. Full shape received: []

Here is my code; I am not sure where I went wrong:

import aspect_based_sentiment_analysis as absa
recognizer = absa.aux_models.BasicPatternRecognizer()
nlp = absa.load('absa/classifier-rest-0.2',pattern_recognizer=recognizer)
text=('We are great fans of Slack, but we wish the subscriptions')
completed_task = nlp(text, aspects=['slack', 'price'])
slack, price = completed_task.examples

OOM errors on long text

I'm still getting OOM errors despite splitting the text into sentences using text_splitter = absa.sentencizer().

I have, for instance, a text of 4528 characters that gets split into 43 sentences (the largest of which is 163 characters long) and still throws an OOM error. Any tips/ideas on how I could handle such cases?
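
One coping sketch (the chunk size is arbitrary and would need tuning): feed the pipeline a few sentences at a time instead of the whole document, so no single forward pass holds all 43 sentences times all aspects.

def analyze_in_chunks(nlp, sentences, aspects, chunk_size=8):
    # `sentences` is the pre-split list; yields one completed task per chunk.
    for i in range(0, len(sentences), chunk_size):
        chunk = " ".join(sentences[i:i + chunk_size])
        yield nlp(chunk, aspects=aspects)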

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,529] = 529 is not in [0, 512) [Op:ResourceGather]

I searched the internet and found that this issue is due to the length of the 'text' I am passing to the nlp function below.

import aspect_based_sentiment_analysis as absa
name = 'absa/classifier-lapt-0.2'
recognizer = absa.aux_models.BasicPatternRecognizer()
nlp = absa.load(name,pattern_recognizer=recognizer)
text= " " #some multiline long review text
completed_task = nlp(text, aspects=['camera','design'])
camera,design = completed_task.examples

print(camera.sentiment)

When I reduce the size of the text, it works fine. Is there any other solution to this?

Traceback (most recent call last):
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-28-07dbd638851c> in <module>
     17 aspect='camera'
     18 camera=0
---> 19 completed_task = nlp(text, aspects=['camera','design'])
     20 camera,design = completed_task.examples

~\AppData\Local\Programs\Python\Python37\lib\site-packages\aspect_based_sentiment_analysis\pipelines.py in __call__(self, text, aspects)
    206     def __call__(self, text: str, aspects: List[str]) -> CompletedTask:
    207         task = self.preprocess(text, aspects)
--> 208         predictions = self.transform(task.examples)
    209         completed_task = self.postprocess(task, predictions)
    210         return completed_task

~\AppData\Local\Programs\Python\Python37\lib\site-packages\aspect_based_sentiment_analysis\pipelines.py in transform(self, examples)
    222         tokenized_examples = self.tokenize(examples)
    223         input_batch = self.encode(tokenized_examples)
--> 224         output_batch = self.predict(input_batch)
    225         predictions = self.review(tokenized_examples, output_batch)
    226         return predictions

~\AppData\Local\Programs\Python\Python37\lib\site-packages\aspect_based_sentiment_analysis\pipelines.py in predict(self, input_batch)
    252                 token_ids=input_batch.token_ids,
    253                 attention_mask=input_batch.attention_mask,
--> 254                 token_type_ids=input_batch.token_type_ids
    255             )
    256             # We assume that our predictions are correct. This is

~\AppData\Local\Programs\Python\Python37\lib\site-packages\aspect_based_sentiment_analysis\models.py in call(self, token_ids, attention_mask, token_type_ids, training, **bert_kwargs)
    144             token_type_ids=token_type_ids,
    145             training=training,
--> 146             **bert_kwargs
    147         )
    148         sequence_output, pooled_output, hidden_states, attentions = outputs

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
    966           with base_layer_utils.autocast_context_manager(
    967               self._compute_dtype):
--> 968             outputs = self.call(cast_inputs, *args, **kwargs)
    969           self._handle_activity_regularization(inputs, outputs)
    970           self._set_mask_metadata(inputs, outputs, input_masks)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\transformers\modeling_tf_bert.py in call(self, inputs, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, training)
    564             # head_mask = tf.constant([0] * self.num_hidden_layers)
    565 
--> 566         embedding_output = self.embeddings([input_ids, position_ids, token_type_ids, inputs_embeds], training=training)
    567         encoder_outputs = self.encoder([embedding_output, extended_attention_mask, head_mask], training=training)
    568 

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
    966           with base_layer_utils.autocast_context_manager(
    967               self._compute_dtype):
--> 968             outputs = self.call(cast_inputs, *args, **kwargs)
    969           self._handle_activity_regularization(inputs, outputs)
    970           self._set_mask_metadata(inputs, outputs, input_masks)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\transformers\modeling_tf_bert.py in call(self, inputs, mode, training)
    146         """
    147         if mode == "embedding":
--> 148             return self._embedding(inputs, training=training)
    149         elif mode == "linear":
    150             return self._linear(inputs)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\transformers\modeling_tf_bert.py in _embedding(self, inputs, training)
    169         if inputs_embeds is None:
    170             inputs_embeds = tf.gather(self.word_embeddings, input_ids)
--> 171         position_embeddings = self.position_embeddings(position_ids)
    172         token_type_embeddings = self.token_type_embeddings(token_type_ids)
    173 

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
    966           with base_layer_utils.autocast_context_manager(
    967               self._compute_dtype):
--> 968             outputs = self.call(cast_inputs, *args, **kwargs)
    969           self._handle_activity_regularization(inputs, outputs)
    970           self._set_mask_metadata(inputs, outputs, input_masks)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\keras\layers\embeddings.py in call(self, inputs)
    182     if dtype != 'int32' and dtype != 'int64':
    183       inputs = math_ops.cast(inputs, 'int32')
--> 184     out = embedding_ops.embedding_lookup(self.embeddings, inputs)
    185     return out
    186 

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\embedding_ops.py in embedding_lookup(params, ids, partition_strategy, name, validate_indices, max_norm)
    324       name=name,
    325       max_norm=max_norm,
--> 326       transform_fn=None)
    327 
    328 

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\embedding_ops.py in _embedding_lookup_and_transform(params, ids, partition_strategy, name, max_norm, transform_fn)
    135       with ops.colocate_with(params[0]):
    136         result = _clip(
--> 137             array_ops.gather(params[0], ids, name=name), ids, max_norm)
    138         if transform_fn:
    139           result = transform_fn(result)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\util\dispatch.py in wrapper(*args, **kwargs)
    178     """Call target, and fall back on dispatchers if there is a TypeError."""
    179     try:
--> 180       return target(*args, **kwargs)
    181     except (TypeError, ValueError):
    182       # Note: convert_to_eager_tensor currently raises a ValueError, not a

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\array_ops.py in gather(***failed resolving arguments***)
   4520     # TODO(apassos) find a less bad way of detecting resource variables
   4521     # without introducing a circular dependency.
-> 4522     return params.sparse_read(indices, name=name)
   4523   except AttributeError:
   4524     return gen_array_ops.gather_v2(params, indices, axis, name=name)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py in sparse_read(self, indices, name)
    674       variable_accessed(self)
    675       value = gen_resource_variable_ops.resource_gather(
--> 676           self._handle, indices, dtype=self._dtype, name=name)
    677 
    678       if self._dtype == dtypes.variant:

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\ops\gen_resource_variable_ops.py in resource_gather(resource, indices, dtype, batch_dims, validate_indices, name)
    554         pass  # Add nodes to the TensorFlow graph.
    555     except _core._NotOkStatusException as e:
--> 556       _ops.raise_from_not_ok_status(e, name)
    557   # Add nodes to the TensorFlow graph.
    558   dtype = _execute.make_type(dtype, "dtype")

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\ops.py in raise_from_not_ok_status(e, name)
   6651   message = e.message + (" name: " + name if name is not None else "")
   6652   # pylint: disable=protected-access
-> 6653   six.raise_from(core._status_to_exception(e.code, message), None)
   6654   # pylint: enable=protected-access
   6655 

~\AppData\Local\Programs\Python\Python37\lib\site-packages\six.py in raise_from(value, from_value)

InvalidArgumentError: indices[0,529] = 529 is not in [0, 512) [Op:ResourceGather]
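
A blunt workaround sketch: BERT's position embeddings stop at 512, so cap the review well below that (leaving room for the [CLS]/[SEP] markers and the aspect tokens) before handing it to the pipeline. The decode/re-encode round trip is lossy but usually adequate; the 480 cut-off is an assumption.

import transformers

tokenizer = transformers.BertTokenizer.from_pretrained('absa/classifier-lapt-0.2')

def truncate(text, max_subtokens=480):
    ids = tokenizer.encode(text, add_special_tokens=False)
    return tokenizer.decode(ids[:max_subtokens])

completed_task = nlp(truncate(text), aspects=['camera', 'design'])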


Assertion Error Fixed

OOM issue during training of classifier on GPU instance

Hi there,

I am trying to train my own models based on the provided template, and it works well on a CPU machine with small training data.

However, when enlarging the training data (>10k sentences or even >100k sentences) I receive an OOM error message. This always seems to happen on GPU instances (tried on AWS with up to G3.16xlarge https://aws.amazon.com/de/ec2/instance-types/g3/ ).

Here is the part of the error message:

2020-08-22 10:59:27.126292: W tensorflow/core/common_runtime/bfc_allocator.cc:434] Allocator (GPU_0_bfc) ran out of memory trying to allocate 48.00MiB (rounded to 50331648)
Current allocation summary follows.
2020-08-22 10:59:27.126394: I tensorflow/core/common_runtime/bfc_allocator.cc:934] BFCAllocator dump for GPU_0_bfc
2020-08-22 10:59:27.126408: I tensorflow/core/common_runtime/bfc_allocator.cc:941] Bin (256): Total Chunks: 56, Chunks in use: 56. 14.0KiB allocated for chunks. 14.0KiB in use in bin. 384B client-requested in use in bin.

...


File "/usr/local/lib/python3.7/dist-packages/optuna/study.py", line 331, in optimize
    func, n_trials, timeout, catch, callbacks, gc_after_trial, None
File "/usr/local/lib/python3.7/dist-packages/optuna/study.py", line 626, in _optimize_sequential
    self._run_trial_and_callbacks(func, catch, callbacks, gc_after_trial)
File "/usr/local/lib/python3.7/dist-packages/optuna/study.py", line 656, in _run_trial_and_callbacks
    trial = self._run_trial(func, catch, gc_after_trial)
File "/usr/local/lib/python3.7/dist-packages/optuna/study.py", line 677, in _run_trial
    result = func(trial)
File "/usr/local/lib/python3.7/dist-packages/txtclassification/fine_tune_absa.py", line 236, in objective
    return experiment(local_folder_name=local_folder_name, **params)
File "/usr/local/lib/python3.7/dist-packages/txtclassification/fine_tune_absa.py", line 160, in experiment
    test_dataset, callbacks, strategy)
File "/usr/local/lib/python3.7/dist-packages/aspect_based_sentiment_analysis/training/classifier.py", line 60, in train_classifier
    callbacks=callbacks
File "/usr/local/lib/python3.7/dist-packages/aspect_based_sentiment_analysis/training/routines.py", line 29, in train
    train_loop(train_step, train_dataset, callbacks, strategy)
File "/usr/local/lib/python3.7/dist-packages/aspect_based_sentiment_analysis/training/routines.py", line 44, in train_loop
    train_step_outputs = step(tf_batch)
File "/usr/local/lib/python3.7/dist-packages/aspect_based_sentiment_analysis/training/routines.py", line 62, in one_device
    return strategy.experimental_run_v2(step, args=batch)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 957, in experimental_run_v2
    return self.run(fn, args=args, kwargs=kwargs, options=options)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/one_device_strategy.py", line 182, in run
    return super(OneDeviceStrategy, self).run(fn, args, kwargs, options)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 951, in run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 2290, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/one_device_strategy.py", line 362, in _call_for_each_replica
    return fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/impl/api.py", line 282, in wrapper
    return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/aspect_based_sentiment_analysis/training/classifier.py", line 31, in train_step
    training=True
File "/usr/local/lib/python3.7/dist-packages/aspect_based_sentiment_analysis/models.py", line 147, in call
    **bert_kwargs
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_bert.py", line 572, in call
    encoder_outputs = self.encoder([embedding_output, extended_attention_mask, head_mask], training=training)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_bert.py", line 378, in call
    layer_outputs = layer_module([hidden_states, attention_mask, head_mask[i]], training=training)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_bert.py", line 356, in call
    intermediate_output = self.intermediate(attention_output)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_bert.py", line 322, in call
    hidden_states = self.intermediate_act_fn(hidden_states)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/core.py", line 420, in call
    return self.activation(inputs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_bert.py", line 65, in gelu
    cdf = 0.5 * (1.0 + tf.math.erf(x / tf.math.sqrt(2.0)))
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1010, in r_binary_op_wrapper
    return func(x, y, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1276, in _add_dispatch
    return gen_math_ops.add_v2(x, y, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 480, in add_v2
    _ops.raise_from_not_ok_status(e, name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 6653, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,128,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:AddV2]

Would be great if you could provide any suggestions how to solve.

Thank you and best,
Tobias
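
Two hedged knobs: the failing tensor shape [32, 128, 3072] implies a batch size of 32, so lowering the batch size in the experiment parameters is the first lever; enabling memory growth additionally stops TF from grabbing the whole GPU up front (it delays, rather than cures, a genuinely too-large batch).

import tensorflow as tf

for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)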

Can't use the library on M1 mac

It seems the library hasn't been updated for the latest version of TensorFlow, 2.4.0-rc0. TensorFlow 2.4.0-rc0 is the only version available on Apple silicon, and it works perfectly, but the library doesn't seem to support TensorFlow versions >2.2. The accepted TensorFlow version list needs to be updated.

Sentiment Analysis Example Code Output Improvement

I am running the Readme file example.

from transformers import BertTokenizer
name = 'absa/classifier-lapt-0.2'
model = absa.BertABSClassifier.from_pretrained(name)
tokenizer = BertTokenizer.from_pretrained(name)
professor = absa.Professor()     # Explained in detail later on.
text_splitter = absa.sentencizer()  # The English CNN model from SpaCy.
nlp = absa.Pipeline(model, tokenizer, professor, text_splitter)
task = nlp(text="the laptop has excellent design.But battery life is not as my per my expectation.", aspects=['design','Battery','processor'])
tokenized_examples = nlp.tokenize(task.examples)
input_batch = nlp.encode(tokenized_examples)
output_batch = nlp.predict(input_batch)
predictions = nlp.review(tokenized_examples, output_batch)
completed_task = nlp.postprocess(task, predictions)

When I run this:
absa.summary(design)
I get the right output, i.e.
Sentiment.positive for "design" Scores (neutral/negative/positive): [0.028 0.057 0.915]

But when I run this
absa.summary(processor)
I get a negative sentiment with a score above 0.9. Instead, I should get a neutral sentiment, as the "processor" aspect is not mentioned in the text; there is no sentence talking about the processor (performance).
How can I make aspects that are not mentioned in the text be assigned a neutral sentiment instead of positive or negative? If anyone has the idea or code for that, please add it here.
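
One hedged post-processing idea, since the model itself is free to extrapolate: fall back to neutral whenever the aspect tokens never occur in the text. The attribute names come from the example outputs quoted elsewhere in this tracker; fuzzy matches (e.g. plural forms) are not handled.

def sentiment_or_neutral(example):
    # If the aspect never appears among the text tokens, do not trust the
    # classifier's extrapolation; report neutral instead.
    if not any(tok in example.text_tokens for tok in example.aspect_tokens):
        return 'neutral'
    return str(example.sentiment)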

Unable to reach good accuracy on Training using the train_classifier.py

examples = absa.load_examples(domain=domain)
When I load data using the above code for the laptop domain, I get an accuracy of only 42%.
Can you please point out any errors I am making?
What GPU specifications, or how much GPU memory, are we expected to train on?

While training on custom data as well, I reach only 35%.

Pre-trained laptop classifier accuracy does not match what is stated in README

I replaced the domain and classifier in test_performance.py from restaurants to laptops, and I get an accuracy of 0.38 (the accuracy stated in the README is 0.8).

This was the code:

import numpy as np
import aspect_based_sentiment_analysis as absa
from aspect_based_sentiment_analysis.training import ConfusionMatrix

def test_semeval_classification_laptops():
    examples = absa.load_examples(dataset='semeval',
                                  domain='laptop',
                                  test=True)
    nlp = absa.load('absa/bert-lapt-0.1')

    metric = ConfusionMatrix(num_classes=3)
    confusion_matrix = nlp.evaluate(examples, metric, batch_size=32)
    confusion_matrix = confusion_matrix.numpy()
    accuracy = np.diagonal(confusion_matrix).sum() / confusion_matrix.sum()

    print(confusion_matrix)
    print(accuracy)
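
One hedged thing to double-check before suspecting the model: other issues here use 'absa/classifier-lapt-0.2' for the laptop domain, so the gap may come from evaluating a different checkpoint ('absa/bert-lapt-0.1') than the one the README reports numbers for.

nlp = absa.load('absa/classifier-lapt-0.2')  # the laptop checkpoint named elsewhere in this tracker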

Not able to run this model in Docker

Traceback (most recent call last):
  File "kfserving-absa.py", line 30, in <module>
    model.load()
  File "kfserving-absa.py", line 14, in load
    self.nlp = absa.load()
  File "/usr/local/lib/python3.7/site-packages/aspect_based_sentiment_analysis/loads.py", line 34, in load
    model = BertABSClassifier.from_pretrained(name, config=config)
  File "/usr/local/lib/python3.7/site-packages/transformers/modeling_tf_utils.py", line 730, in from_pretrained
    model(model.dummy_inputs, training=False)  # build the network with dummy inputs
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/aspect_based_sentiment_analysis/models.py", line 150, in call
    logits = self.classifier(pooled_output)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 964, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 2398, in _maybe_build
    self.input_spec, inputs, self.name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/input_spec.py", line 196, in assert_input_compatibility
    str(x.shape.as_list()))
ValueError: Input 0 of layer classifier is incompatible with the layer: : expected min_ndim=2, found ndim=0. Full shape received: []

aspect-based-sentiment-analysis==2.0.1
tensorflow==2.2.0

I tried different TF versions, but the issue is the same.

Read me

In the "supervising model predictions" section of the README, the final sentence says:
"There are a lot of articles that illustrate various concerns why drawing conclusions about model reasoning directly from attentions might be misleading. In the article [here], we validate and analyse explanations in detail."

The article regarding the attention analysis is not hyperlinked. Is it possible to share the link? That would be really insightful.
Thanks for your help!

InvalidArgumentError when making predictions

I get the following error when making predictions on some news articles:

  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/aspect_based_sentiment_analysis/pipelines.py", line 208, in __call__
    predictions = self.transform(task.examples)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/aspect_based_sentiment_analysis/pipelines.py", line 224, in transform
    output_batch = self.predict(input_batch)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/aspect_based_sentiment_analysis/pipelines.py", line 251, in predict
    logits, hidden_states, attentions = self.model.call(
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/aspect_based_sentiment_analysis/models.py", line 141, in call
    outputs = self.bert(
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/transformers/modeling_tf_bert.py", line 601, in call
    embedding_output = self.embeddings(input_ids, position_ids, token_type_ids, inputs_embeds, training=training)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/transformers/modeling_tf_bert.py", line 159, in call
    return self._embedding(input_ids, position_ids, token_type_ids, inputs_embeds, training=training)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/transformers/modeling_tf_bert.py", line 185, in _embedding
    position_embeddings = tf.cast(self.position_embeddings(position_ids), inputs_embeds.dtype)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/keras/layers/embeddings.py", line 189, in call
    out = embedding_ops.embedding_lookup_v2(self.embeddings, inputs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/ops/embedding_ops.py", line 394, in embedding_lookup_v2
    return embedding_lookup(params, ids, "div", name, max_norm=max_norm)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/ops/embedding_ops.py", line 322, in embedding_lookup
    return _embedding_lookup_and_transform(
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/ops/embedding_ops.py", line 138, in _embedding_lookup_and_transform
    array_ops.gather(params[0], ids, name=name), ids, max_norm)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 4676, in gather
    return params.sparse_read(indices, name=name)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 687, in sparse_read
    value = gen_resource_variable_ops.resource_gather(
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 556, in resource_gather
    _ops.raise_from_not_ok_status(e, name)
  File "/home/vlad/anaconda3/envs/data-science/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6843, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,560] = 560 is not in [0, 512) [Op:ResourceGather]

My code:

nlp_spacy = spacy.load('en_core_web_sm')
recognizer = absa.aux_models.BasicPatternRecognizer()
nlp_absa = absa.load('absa/classifier-rest-0.2.1', pattern_recognizer=recognizer)
completed_task = nlp_absa(text=text, aspects=[ent.text for ent in nlp_spacy(text).ents])
found = completed_task.examples
print(found)

Any ideas how I could get around it?
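
A hedged suggestion: the load() signature shown in other tracebacks here accepts a text_splitter, and passing the spaCy-based sentencizer splits long articles into sentences so that no single input exceeds BERT's 512-position limit:

nlp_absa = absa.load(
    'absa/classifier-rest-0.2.1',
    text_splitter=absa.sentencizer(),
    pattern_recognizer=recognizer,
)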

Indexing error in pipeline module for long sentences

Getting the following error for long sentences.

The python code used is as below:

import aspect_based_sentiment_analysis as absa
nlp = absa.load()
sentence = "...."
aspects = [....]
outputs = nlp(sentence, aspects=aspects)

Error:

IndexError: indices[0,512] = 512 is not in [0, 512) [Op:ResourceGather]
