lixin4ever / bert-e2e-absa


[EMNLP 2019 Workshop] Exploiting BERT for End-to-End Aspect-based Sentiment Analysis

Home Page: https://arxiv.org/abs/1910.00883

License: Apache License 2.0

Python 98.64% Shell 1.36%

aspect-based-sentiment-analysis

bert-e2e-absa's Issues

Running the model on bert-large-uncased

Hi there, thanks for maintaining this great model :)

I have not had any issues running the model using the default --model_name_or_path bert-base-uncased option in train.sh. However, I received an error when I tried running with the bert-large-uncased option.

Details

This is the contents of my train.sh file:

#!/usr/bin/env bash
TASK_NAME=laptop14
ABSA_TYPE=san
CUDA_VISIBLE_DEVICES=0,2,3 python main.py --model_type bert \
                         --absa_type ${ABSA_TYPE} \
                         --tfm_mode finetune \
                         --fix_tfm 0 \
                         --model_name_or_path bert-large-uncased \
                         --data_dir ./data/${TASK_NAME} \
                         --task_name ${TASK_NAME} \
                         --per_gpu_train_batch_size 16 \
                         --per_gpu_eval_batch_size 8 \
                         --learning_rate 2e-5 \
                         --do_train \
                         --do_eval \
                         --do_lower_case \
                         --tagging_schema BIEOS \
                         --overfit 0 \
                         --overwrite_output_dir \
                         --eval_all_checkpoints \
                         --MASTER_ADDR localhost \
                         --MASTER_PORT 28512 \
                         --max_steps 1500

I also changed nhead from 14 to 16 in this line in accordance with bert-large-uncased.
When running train.sh I receive the following traceback:

Traceback (most recent call last):
  File "main.py", line 542, in <module>
    main()
  File "main.py", line 445, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer)
  File "main.py", line 205, in train
    ouputs = model(**inputs)
  File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jupyter/BERT-E2E-ABSA/absa_layer.py", line 461, in forward
    logits = self.classifier(classifier_input)
  File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/functional.py", line 1371, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [1392 x 1024], m2: [768 x 14] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:752

Any ideas on what is going wrong here? Do I need to change anything else in the code in order to use the bert-large-uncased model? Thank you!
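
The shapes in the RuntimeError can be read directly: the input has 1024 features (the hidden size of bert-large-uncased), while the classifier weight was built for 768 features (the hidden size of bert-base-uncased) and 14 output labels. This suggests that a layer width is still fixed at 768 somewhere, although whether that is the only change needed in this repository is not confirmed here. A minimal sketch of the check, where the helper names are hypothetical and not part of the repository:

```python
# Hidden sizes published in the two checkpoints' configurations.
HIDDEN_SIZE = {"bert-base-uncased": 768, "bert-large-uncased": 1024}


def linear_shapes_compatible(input_shape, weight_t_shape):
    """Return True if `input @ weight.t()` is a valid matrix product."""
    return input_shape[-1] == weight_t_shape[0]


def expected_in_features(model_name):
    """Input width a layer on top of the encoder needs (hypothetical helper)."""
    return HIDDEN_SIZE[model_name]


# Shapes copied from the RuntimeError above: m1 is the input, m2 is weight.t().
m1, m2 = (1392, 1024), (768, 14)
print(linear_shapes_compatible(m1, m2))            # False
print(expected_in_features("bert-large-uncased"))  # 1024
```

Deriving layer widths from the loaded configuration rather than from a constant would avoid this class of error when switching checkpoints.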

About the dataset annotation scheme

Problem solved.
Thanks to the author for the contribution and for sharing.

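Some questions about the code in seq_util.py

Should line 17 of this code be changed as follows?
Original code: if cur_ts_tag == 'O' or cur_ts_tag == 'EQ':
New code: assert cur_ts_tag != 'EQ'  # (an OT sequence contains only O and T-SENTIMENT, never EQ)
if cur_ts_tag == 'O':

Should line 114 of this code be handled as follows?
Original code: if ts_tag == 'O' or ts_tag == 'EQ':
New code:
if ts_tag == 'EQ': continue  # (EQ comes from BERT's wordpiece tokenization; it should be left out of the sequence rather than replaced with O)
if ts_tag == 'O': (the rest stays unchanged)

These are two small questions about the code. Thanks to the author for answering.
Best wishes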

About the tag 'EQ'

First of all, great work.

I have a small question about the tag 'EQ', which appears in all the tagging schemes. What does it actually mean? I didn't see anything related to it in the papers mentioned in the README. I also see that ot2bieos_ts() in seq_utils.py replaces the tag 'EQ' with 'O'. Could you explain why?

How to test the statement view

How can I test the opinion expressed in a single sentence? For example, I input "this movie is very good" and the feedback is positive.

Running on the test set returns no evaluation results?

BERT-GRU model

Hi,

In your paper, did you use a bidirectional GRU layer or a unidirectional one? In this code I can see that bidirectional is set to true by default.

Thank you

Problems with fast_run.py

Hi,

when I try to run the fast_run.py even though I have every requirement already installed, I get the following error:

  File "main.py", line 11, in <module>
    from transformers import AdamW, WarmupLinearSchedule
ImportError: cannot import name 'WarmupLinearSchedule' from 'transformers' (/Users/doruk/opt/anaconda3/lib/python3.8/site-packages/transformers/__init__.py)
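
Newer releases of transformers no longer export WarmupLinearSchedule; the library provides get_linear_schedule_with_warmup for the same purpose. Either an older release is installed, or main.py is adapted. The multiplier such a schedule applies to the base learning rate can be sketched in plain Python. This is only an illustration of the schedule's shape, with a hypothetical function name, and not the repository's or the library's code:

```python
def linear_warmup_factor(step, warmup_steps, total_steps):
    """Learning-rate multiplier: linear ramp from 0 to 1, then linear decay to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# With 10 warmup steps out of 100, the factor peaks at step 10.
for step in (0, 5, 10, 55, 100):
    print(step, linear_warmup_factor(step, warmup_steps=10, total_steps=100))
```

The factor is multiplied by the configured learning rate (2e-5 in the scripts on this page) at every optimizer step.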

NotImplementedError: Make sure `_init_weigths` is implemented (Error on Google Colab)

!git clone https://github.com/lixin4ever/BERT-E2E-ABSA.git

!pip install transformers
!pip install tensorboardX
!python fast_run.py

I'm trying to run your model on colab with the previous steps, but it throws the following error:

2021-07-29 14:43:21.980960: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
07/29/2021 14:43:23 - WARNING - __main__ - Process rank: -1, device: cpu, n_gpu: 0, distributed training: False, 16-bits training: False
Traceback (most recent call last):
  File "main.py", line 542, in <module>
    main()
  File "main.py", line 432, in main
    config=config, cache_dir='/content/.cache')
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1357, in from_pretrained
    _fast_init=_fast_init,
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1455, in _load_state_dict_into_model
    model._init_weights(module)
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 579, in _init_weights
    raise NotImplementedError(f"Make sure `_init_weigths` is implemented for {self.__class__}")
NotImplementedError: Make sure `_init_weigths` is implemented for <class 'absa_layer.BertABSATagger'>

How can I solve this?

Evaluate ATE and ASC separately

Hi,

Thank you for sharing your code with us.
I have a question about predictions. As I read in your paper, you reported a single result for these two tasks. Is it possible to return evaluation scores for each task separately (as you did in this code: https://github.com/lixin4ever/E2E-TBSA), so that we can compare this model against single-task models?

Thank you
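
One way to get an aspect-only score from joint predictions is to compare spans with the sentiment label ignored. The sketch below assumes predictions are already available as (begin, end, sentiment) triples for one sentence and uses exact span matching; the function name and the data layout are assumptions, and this is not the repository's evaluation code:

```python
def span_f1(gold, pred, use_sentiment=True):
    """Exact-match F1 over (begin, end, sentiment) spans of one sentence."""
    def key(span):
        beg, end, sentiment = span
        return (beg, end, sentiment) if use_sentiment else (beg, end)

    gold_set = {key(s) for s in gold}
    pred_set = {key(s) for s in pred}
    n_hit = len(gold_set & pred_set)
    precision = n_hit / len(pred_set) if pred_set else 0.0
    recall = n_hit / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [(1, 3, "POS"), (9, 9, "NEG")]
pred = [(1, 3, "POS"), (9, 9, "POS")]   # second span has the wrong sentiment
print(span_f1(gold, pred, use_sentiment=True))   # 0.5 (joint task)
print(span_f1(gold, pred, use_sentiment=False))  # 1.0 (aspect boundaries only)
```

Scoring sentiment classification on its own is a different matter, since it is usually evaluated with gold aspect boundaries given as input.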

Tags for using raw data on the trained model

Hi @lixin4ever, thank you for this amazing work. I have a question about preparing raw data for use with the trained model.
Suppose I want tags for the sentence "Service is good,ambience is bad". How can I get each tag together with its sentiment (T-POS, T-NEU or T-NEG)?
I am able to get ordinary tags such as [B-ORG] using spaCy, but what is the procedure for getting the data into this format?
It would be very useful if you could help me with this.

Thanks

Shortcut for tagging unseen/unlabeled data

Hi!

Is there a way to disable the tagging part of the input text file? Is there a way to modify the code so that the input data doesn't require the part after '####'? (For the inference part.)
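
Another issue on this page reports preparing unlabeled text as if it were a test set, with every token labeled "O". A minimal sketch of that workaround follows; it assumes whitespace tokenization and the `sentence####word=tag ...` line layout seen in the data files, and the function name is hypothetical:

```python
def to_dummy_tagged_line(sentence):
    """Format a raw sentence as `sentence####tok=O tok=O ...` with placeholder tags."""
    tokens = sentence.split()  # assumption: tokens are separated by whitespace
    tagged = " ".join(f"{token}=O" for token in tokens)
    return f"{' '.join(tokens)}####{tagged}"


print(to_dummy_tagged_line("Service is good"))
# Service is good####Service=O is=O good=O
```

The placeholder tags carry no information, so any evaluation scores computed against them are meaningless; only the predicted tags are of interest.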

Question about the situation when `absa_type==crf`

BERT-E2E-ABSA/absa_layer.py

Line 442 in cf4ddc6

if self.tagger is None or self.tagger_config.absa_type == 'crf':

I wonder why you pass tagger_input to self.classifier when absa_type==crf. That is, you don't use self.tagger=CRF() when "absa_type==crf".

Format Novel Data

Thank you for releasing your code. I have been able to train a model using my own dataset. I now want to use the trained model to predict on unseen data. My question is: what format should the data (which has no annotations) have, and which flags should I set when only doing predictions with a trained model?

Thanks very much, Best

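Can this method be applied to continuous values?

Hello!
I am a student working in NLP. I recently read your paper and thought it was excellent, so I am currently reproducing your method and results, and I hope to apply them to other datasets later.
I noticed that your method and source code are mainly meant for sentiment classification (positive, negative, and neutral). Since I would like to predict continuous sentiment scores in (-1, 1), I wanted to ask whether score-type prediction could be achieved with small changes to your method.
Thank you!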

Using the output variable (output_ts) of the data

Hi,

I want to use the output variable which stores the aspect, sentiment pairs of the input sentence. I think it is the output_ts variable in work.py

I actually want to use it in a different .py file so I import the work.py like a module and I try to use output_ts there. The error I encounter is that work.py doesn't have an attribute named "output_ts". So I tried to return output_ts after the predict function inside work.py but that also didn't work. Is there a way to solve this issue?

Retrieving the wrong predictions

Hey, great work!
I was trying to analyze the sentences for which the model's predictions are incorrect. However, the sentences are converted into a matrix of features in glue_utils, and the loss is calculated later. Could you provide a step-by-step way to simply print the observed output, the predicted output, and the corresponding aspects?

Thanks in advance

What does local rank mean in code

I don't quite understand what "local_rank" means in the code, could you explain it please? Thank you.

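A small question about self-attention

Hello, I would like to ask a question.

As in the part marked with the red box in the figure:
src = src + self.dropout(src2)
Does this part add the original token embedding to the output of the attention?
Can't the attention output src2 be used directly as the output?
Is adding them together more reasonable?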

Question about dataset preprocessing

From what I understand based on the official paper, the approach used in this repository is trying to predict the following sequence of tags based on the input sentence:

The train.txt files in the data folder are used to train the model to predict such a sequence. I also noticed that each line in the file consists of (1) the sentence and (2) the tag sequence, separated by "####". Regarding this, I have several questions:

How did you convert the original XML dataset into the current BIEOS/BIO/OT tagging scheme? Is there an open-source tool for easily applying the tagging scheme to an unlabeled dataset?
How should the tag sequence from train.txt be preprocessed into the appropriate format for model training?
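
Reading one such line back into words and tags can be sketched as follows. The layout (`sentence####word=tag word=tag ...`) is taken from the description above and from the examples quoted elsewhere on this page; the function name is hypothetical, and splitting each item on its last "=" is an assumption made so that tokens containing "=" survive:

```python
def parse_tagged_line(line):
    """Split `sentence####w1=t1 w2=t2 ...` into (sentence, words, tags)."""
    sentence, tag_string = line.strip().split("####", 1)
    words, tags = [], []
    for item in tag_string.split(" "):
        word, tag = item.rsplit("=", 1)  # split on the last "=" only
        words.append(word)
        tags.append(tag)
    return sentence, words, tags


line = "The OS is great .####The=O OS=T-POS is=O great=O .=O"
sentence, words, tags = parse_tagged_line(line)
print(words)  # ['The', 'OS', 'is', 'great', '.']
print(tags)   # ['O', 'T-POS', 'O', 'O', 'O']
```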

How to prepare the dataset?

Hi @lixin4ever, I appreciate your work. Thank you for that.
I have a question: you have provided datasets for laptop and rest_total.
I would like to prepare a similar dataset for Twitter.
Can you help me with preparing the dataset with the appropriate labels, as defined in the get_labels function in glue_utils.py?
In other words, how can I label the data?
Do I also need the cached data? If yes, how can I prepare it?

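About sentiment consistency and word boundaries

Hello, I saw that in your earlier paper, A Unified Model for Opinion Target Extraction and Target Sentiment Prediction, you used Target Boundary Guided TBSA and Maintaining Sentiment Consistency to guide the model away from outputting inconsistent sentiment labels or BIO labels. In this work, which combines BERT with different downstream models, I don't seem to see a similar mechanism.

Do these mechanisms not need to be considered in the downstream models that follow BERT? Does the sentiment consistency problem essentially never occur?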

Cannot push LFS object to forked repo

Hi,

I am trying to upload the finetune data to my forked repository (I am using Git Large File Storage since the file is bigger than 10MB), but I get an error saying that I cannot upload new objects to a public fork.

I think the reason behind this error is that there are no LFS objects in the original repository. How can I solve this?

Aspect Sentiment Extraction Using the Trained Model

Hi,

I want to extract the sentiment directly for a given sentence. Is this possible with the repo in its current state? Or do we have to train the model first and then make some changes in the work.py file to get the predictions?

Thanks for the updated codes.

Dataset question

Hello, I would like to ask about the following. The example in the paper uses BIOES, i.e. The=O AMD=B-POS Turin=I-POS Processor=E-POS seems=O to=O perform=O better=O than=O Intel=S-NEG; but the data in the dataset looks like The=O Mountain=T-POS Lion=T-POS OS=T-POS is=O not=O hard=O to=O figure=O out=O if=O you=O are=O familiar=O with=O Microsoft=T-NEU Windows=T-NEU .=O, with every tag starting with T. Shouldn't it also be in BIOES format?
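
Tags of the form T-POS can be converted to boundary-aware tags by looking at each token's neighbours. The sketch below is an illustration of that idea and not the repository's ot2bieos_ts(); the function names are hypothetical. Following the behaviour described in another issue on this page, 'EQ' is treated like 'O':

```python
def sentiment_of(tag):
    """Return the sentiment part of a T-* tag, or None for 'O' and 'EQ'."""
    return None if tag in ("O", "EQ") else tag.split("-", 1)[1]


def ot_to_bieos(ot_tags):
    """Convert an OT tag sequence (O, T-POS, ...) to BIEOS tags."""
    sentiments = [sentiment_of(tag) for tag in ot_tags]
    result = []
    for i, sentiment in enumerate(sentiments):
        if sentiment is None:
            result.append("O")
            continue
        prev_sentiment = sentiments[i - 1] if i > 0 else None
        next_sentiment = sentiments[i + 1] if i + 1 < len(sentiments) else None
        starts = prev_sentiment != sentiment  # span begins at this token
        ends = next_sentiment != sentiment    # span ends at this token
        if starts and ends:
            prefix = "S"
        elif starts:
            prefix = "B"
        elif ends:
            prefix = "E"
        else:
            prefix = "I"
        result.append(f"{prefix}-{sentiment}")
    return result


print(ot_to_bieos(["O", "T-POS", "T-POS", "T-POS", "O", "T-NEU", "T-NEU", "O"]))
# ['O', 'B-POS', 'I-POS', 'E-POS', 'O', 'B-NEU', 'E-NEU', 'O']
```

Under this reading, the files can store the simpler OT tags while the --tagging_schema option selects the scheme used for training.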

closed

Inference Time

Hi,

When performing inference on around 500k sentences, the speed starts out very fast: the first 50k sentences are done in around 30 minutes on a Titan V. Afterwards, however, the speed drops significantly; the next 40k sentences take around 10 hours, with an estimate of over 50 hours for the remaining sentences. I don't understand why this is.

tag2ts problem

When will this condition be satisfied, since it's impossible for the sentiments list to have more than one element at this point?

elif pos == 'B':
    beg = i
    if len(sentiments) > 1:
        # remove the effect of the noisy I-{POS,NEG,NEU}
        # print(sentiments)
        sentiments = [sentiments[-1]]

"Connection error, and we cannot find the requested files in the cached path."

Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

AttributeError: 'XLNetConfig' object has no attribute 'absa_tagger_config'

Hi, thanks for the great repo!

BERT runs very well for me; however, when switching to XLNet I constantly get this error.

Error:

Traceback (most recent call last):
  File "main.py", line 522, in <module>
    main()
  File "main.py", line 412, in main
    config=config, cache_dir='./cache')
  File "C:\Anaconda3\envs\py37_dev\lib\site-packages\transformers\modeling_utils.py", line 342, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "\BERT-E2E-ABSA\absa_layer.py", line 488, in __init__
    self.tagger_config = xlnet_config.absa_tagger_config
AttributeError: 'XLNetConfig' object has no attribute 'absa_tagger_config'

Here's my train.bat:

#!/usr/bin/env bash
set TASK_NAME="rest_total"
set ABSA_TYPE="tfm"
set CUDA_VISIBLE_DEVICES=0 
python main.py --model_type xlnet ^
                         --absa_type %ABSA_TYPE% ^
                         --tfm_mode finetune ^
                         --fix_tfm 0 ^
                         --model_name_or_path xlnet-base-cased ^
                         --data_dir ./data/sents ^
                         --task_name %TASK_NAME% ^
                         --per_gpu_train_batch_size 16 ^
                         --per_gpu_eval_batch_size 8 ^
                         --learning_rate 2e-5 ^
                         --do_train ^
                         --do_eval ^
                         --tagging_schema BIEOS ^
                         --overfit 0 ^
                         --overwrite_output_dir ^
                         --eval_all_checkpoints ^
                         --MASTER_ADDR localhost ^
                         --MASTER_PORT 28512 ^
                         --max_steps 2000

I am running transformers 2.0.0, as per the requirements.

>>> import transformers as t
>>> t.__version__
'2.0.0'

About training time

Hi, I really appreciate your work, and I'm wondering how long it takes to train.

Aspect Extraction on fresh Laptop or Restaurant Reviews

Hi @lixin4ever
Does this model support extracting aspect terms from fresh restaurant or laptop reviews? If yes, please guide me through this.

Error with work.sh on large set of unlabeled text

I tried to run a trained model on a "large" set of book reviews (15 MB).
I prepared the file as if it were a test set, with all tokens labeled as "O".
I get the following error:

Load checkpoint ./bert-tfm-bookreviews-finetune/checkpoint-1200/pytorch_model.bin...
test class count: [0. 0. 0.]
***** Running prediction *****
Evaluating:   0%|                                                                                                                                                    | 0/69420 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "work.py", line 216, in <module>
    main()
  File "work.py", line 125, in main
    predict(args, model, tokenizer)
  File "work.py", line 161, in predict
    outputs = model(**inputs)
  File "/home/p286012/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/var/tmp/andreas/bookreviews-absa/BERT-E2E-ABSA/absa_layer.py", line 437, in forward
    attention_mask=attention_mask, head_mask=head_mask)
  File "/home/p286012/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/p286012/.local/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 964, in forward
    past_key_values_length=past_key_values_length,
  File "/home/p286012/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/p286012/.local/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 206, in forward
    embeddings += position_embeddings
RuntimeError: The size of tensor a (1304) must match the size of tensor b (512) at non-singleton dimension 1
sh work-unlabeled.sh >   398.30s user 17.21s system 101% cpu 6:50.79 total

512 happens to be the limit of BERT, so maybe the input didn't get truncated correctly, but I used the default of a maximum of 128 tokens per sentence.
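
Whether truncation is really the cause is not established here, but a quick diagnostic is to look for unusually long lines in the input file. The sketch below counts whitespace-separated words in the sentence part of each line; the function name and the threshold are assumptions, and the wordpiece count seen by BERT is at least as large as the word count:

```python
def find_long_sentences(lines, max_words=128):
    """Return (line_number, word_count) for lines whose sentence part is too long."""
    too_long = []
    for line_no, line in enumerate(lines, start=1):
        sentence = line.split("####", 1)[0]  # text before the tag sequence
        n_words = len(sentence.split())
        if n_words > max_words:
            too_long.append((line_no, n_words))
    return too_long


sample = [
    "a b c####a=O b=O c=O",
    " ".join(["w"] * 200) + "####" + " ".join(["w=O"] * 200),
]
print(find_long_sentences(sample, max_words=128))  # [(2, 200)]
```

In practice the lines would come from the test file, for example via `open(path, encoding="utf-8")`. Lines flagged this way, such as whole reviews that were not split into sentences, are candidates for the input that reached 1304 positions.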

work-unlabeled.sh is basically the default, only I used cased BERT:

#!/usr/bin/env bash
TASK_NAME="bookreviews-goodreads_rest"
ABSA_HOME="./bert-tfm-bookreviews-finetune"
CUDA_VISIBLE_DEVICES=0 python work.py --absa_home ${ABSA_HOME} \
                      --ckpt ${ABSA_HOME}/checkpoint-1200 \
                      --model_type bert \
                      --data_dir ./data/${TASK_NAME} \
                      --task_name ${TASK_NAME} \
                      --model_name_or_path bert-base-cased \
                      --cache_dir ./cache \
                      --max_seq_length 128 \
                      --tagging_schema BIEOS

similarly for train.sh

#!/usr/bin/env bash
TASK_NAME=bookreviews
ABSA_TYPE=tfm
CUDA_VISIBLE_DEVICES=0,2,3 python main.py --model_type bert \
                         --absa_type ${ABSA_TYPE} \
                         --tfm_mode finetune \
                         --fix_tfm 0 \
                         --model_name_or_path bert-base-cased \
                         --data_dir ./data/${TASK_NAME} \
                         --task_name ${TASK_NAME} \
                         --per_gpu_train_batch_size 16 \
                         --per_gpu_eval_batch_size 8 \
                         --learning_rate 2e-5 \
                         --do_train \
                         --do_eval \
                         --tagging_schema BIEOS \
                         --overfit 0 \
                         --overwrite_output_dir \
                         --eval_all_checkpoints \
                         --MASTER_ADDR localhost \
                         --MASTER_PORT 28512 \
                         --max_steps 1500

Error with work.sh on CRF model

Hi there,
Running work.sh on a tfm model works fine, but with crf I get the following error:

Load checkpoint ./bert-crf-bookreviews-finetune/checkpoint-1200/pytorch_model.bin...
cached_features_file: ./data/bookreviews/cached_test_bert-base-cased_128_bookreviews
***** Running prediction *****
Evaluating:   0%|▏                                                                                                                                             | 1/800 [00:00<01:51,  7.17it/s]
Traceback (most recent call last):
  File "work.py", line 216, in <module>
    main()
  File "work.py", line 125, in main
    predict(args, model, tokenizer)
  File "work.py", line 200, in predict
    total_preds = np.append(total_preds, preds, axis=0)
  File "<__array_function__ internals>", line 6, in append
  File "/home/p286012/.local/lib/python3.7/site-packages/numpy/lib/function_base.py", line 4745, in append
    return concatenate((arr, values), axis=axis)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 12 and the array at index 1 has size 19

Any idea what's wrong? Training did work.
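
The ValueError says that two prediction arrays with different widths (12 and 19) cannot be concatenated along axis 0, which suggests that the CRF path yields predictions whose length varies from batch to batch. One possible workaround, sketched here in plain Python, is to pad every row to a common width before stacking. The function name and the padding value are assumptions, and this is not the maintainers' fix:

```python
def pad_batches(batches, pad_value=-1):
    """Flatten batches of variable-length rows into rows of one common width."""
    width = max(len(row) for batch in batches for row in batch)
    padded = []
    for batch in batches:
        for row in batch:
            padded.append(list(row) + [pad_value] * (width - len(row)))
    return padded


batches = [[[1, 2]], [[3, 4, 5], [6]]]
print(pad_batches(batches, pad_value=0))
# [[1, 2, 0], [3, 4, 5], [6, 0, 0]]
```

The padded positions have to be ignored afterwards, for example by tracking the true length of each sequence.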

Why do you return an empty list when the tagging_schema is OT?

def get_labels(self, tagging_schema):

    if tagging_schema == 'OT':
        return []
    elif tagging_schema == 'BIO':
        return ['O', 'EQ', 'B-POS', 'I-POS', 'B-NEG', 'I-NEG', 'B-NEU', 'I-NEU']
    elif tagging_schema == 'BIEOS':
        return ['O', 'EQ', 'B-POS', 'I-POS', 'E-POS', 'S-POS',
        'B-NEG', 'I-NEG', 'E-NEG', 'S-NEG',
        'B-NEU', 'I-NEU', 'E-NEU', 'S-NEU']
    else:
        raise Exception("Invalid tagging schema %s..." % tagging_schema)

New data

Meaning of a parameter

Hi @lixin4ever

I would like to ask what the parameter |Y| in the figure below stands for. Does it refer to the number of all possible tags?

Best wishes

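Chinese model

I want to use this model to run on a Chinese dataset. I have already processed the data into the same format as the English data, but I don't know where to switch the pretrained BERT model to the chinese_bert model.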

NotImplementedError: Make sure `_init_weigths` is implemented for <class 'absa_layer.BertABSATagger'>. What is going on?

ImportError: cannot import name 'BERT_PRETRAINED_MODEL_ARCHIVE_MAP' from 'transformers'

Hi Li Xin,

Thank you for this repo !

I tried to replicate your results with the command 'python fast_run.py' and ran into this problem.
It seems to be a problem on the transformers side. I tried installing it both with pip and from source (latest version 3.0.2) on PyTorch 1.2 (Ubuntu 20.04, Python 3.7, CUDA 11).
Did I miss something? Do you have any idea why this happens?
I searched the internet and the Hugging Face forum, but I did not see anyone mention this problem.

Traceback (most recent call last):
  File "main.py", line 12, in <module>
    from absa_layer import BertABSATagger, XLNetABSATagger
  File "/home/hoa/BERT_E2E_ABSA/absa_layer.py", line 5, in <module>
    from bert import BertPreTrainedModel, XLNetPreTrainedModel
  File "/home/hoa/BERT_E2E_ABSA/bert.py", line 19, in <module>
    from transformers import BERT_PRETRAINED_MODEL_ARCHIVE_MAP, BERT_PRETRAINED_CONFIG_ARCHIVE_MAP
ImportError: cannot import name 'BERT_PRETRAINED_MODEL_ARCHIVE_MAP' from 'transformers' (/home/hoa/transformers/src/transformers/__init__.py)

Transformers itself seems to be installed correctly, since the following works:

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
[{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
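
Another issue on this page mentions running transformers 2.0.0 "as per the requirements", while this report uses 3.0.2, so a version mismatch is a plausible cause of the missing name. A small sketch of a guard that compares the installed version string against the pinned one follows; the function names are hypothetical, and the pinned value is taken from that other issue rather than checked against the repository:

```python
import re


def parse_version(version):
    """Turn '3.0.2' (or '4.5.0.dev0') into a tuple of up to three integers."""
    parts = []
    for part in version.split(".")[:3]:
        match = re.match(r"\d+", part)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def matches_pinned(installed, pinned="2.0.0"):
    """Return True if the installed version equals the pinned one."""
    return parse_version(installed) == parse_version(pinned)


print(matches_pinned("3.0.2"))  # False
print(matches_pinned("2.0.0"))  # True
```

The installed version string is available as `transformers.__version__`, as shown in the XLNet issue on this page.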

How to use the trained model

Hi!

I was wondering if I can use the model, after training it, on new stuff to do the aspect extraction and also to find aspect sentiment polarity?

If yes how can I do it?

Thanks!

Deployment ?

Where have you deployed the model ?
If you have not deployed the model, I can help you out.
Let's collaborate!