lixin4ever / bert-e2e-absa
[EMNLP 2019 Workshop] Exploiting BERT for End-to-End Aspect-based Sentiment Analysis
Home Page: https://arxiv.org/abs/1910.00883
License: Apache License 2.0
Hi there, thanks for maintaining this great model :)
I have not had any issues running the model with the default --model_name_or_path bert-base-uncased option in train.sh. However, I received an error when I tried running with the bert-large-uncased option.
This is the content of my train.sh file:
#!/usr/bin/env bash
TASK_NAME=laptop14
ABSA_TYPE=san
CUDA_VISIBLE_DEVICES=0,2,3 python main.py --model_type bert \
--absa_type ${ABSA_TYPE} \
--tfm_mode finetune \
--fix_tfm 0 \
--model_name_or_path bert-large-uncased \
--data_dir ./data/${TASK_NAME} \
--task_name ${TASK_NAME} \
--per_gpu_train_batch_size 16 \
--per_gpu_eval_batch_size 8 \
--learning_rate 2e-5 \
--do_train \
--do_eval \
--do_lower_case \
--tagging_schema BIEOS \
--overfit 0 \
--overwrite_output_dir \
--eval_all_checkpoints \
--MASTER_ADDR localhost \
--MASTER_PORT 28512 \
--max_steps 1500
I also changed nhead from 14 to 16 in this line in accordance with bert-large-uncased.
When running train.sh I receive the following traceback:
Traceback (most recent call last): | 0/172 [00:00<?, ?it/s]
File "main.py", line 542, in <module>
main()
File "main.py", line 445, in main
global_step, tr_loss = train(args, train_dataset, model, tokenizer)
File "main.py", line 205, in train
ouputs = model(**inputs)
File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/jupyter/BERT-E2E-ABSA/absa_layer.py", line 461, in forward
logits = self.classifier(classifier_input)
File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
return F.linear(input, self.weight, self.bias)
File "/home/jupyter/BERT-E2E-ABSA/BERT-E2E-ABSA/lib/python3.7/site-packages/torch/nn/functional.py", line 1371, in linear
output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [1392 x 1024], m2: [768 x 14] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:752
Any ideas on what is going wrong here? Do I need to change anything else in the code in order to use the bert-large-uncased model? Thank you!
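The traceback shows a 1024-wide input reaching a weight matrix built for 768 input features, which suggests that some layer size is still fixed to the bert-base width. The following is a minimal sketch of that shape arithmetic, not the repository's code: `HIDDEN_SIZES`, `can_apply_linear`, `classifier_shape` and `head_dim` are illustrative names, and the idea of sizing the classifier from the encoder width is an assumption.

```python
# Hidden sizes of the two BERT checkpoints discussed above.
HIDDEN_SIZES = {"bert-base-uncased": 768, "bert-large-uncased": 1024}


def can_apply_linear(input_shape, weight_t_shape):
    """Return True if input @ weight.t() is a valid matrix product.

    input_shape is (rows, in_features); weight_t_shape is the shape of the
    transposed weight, (in_features, out_features), as printed in the error.
    """
    return input_shape[1] == weight_t_shape[0]


def classifier_shape(model_name, num_labels):
    """Hypothetical helper: size the classifier from the encoder width."""
    return (HIDDEN_SIZES[model_name], num_labels)


def head_dim(model_name, nhead):
    """Per-head width; the hidden size must be divisible by nhead."""
    hidden = HIDDEN_SIZES[model_name]
    if hidden % nhead != 0:
        raise ValueError(f"hidden size {hidden} is not divisible by nhead={nhead}")
    return hidden // nhead


# Shapes from the traceback: m1 = [1392 x 1024], m2 = [768 x 14].
print(can_apply_linear((1392, 1024), (768, 14)))  # False: 1024 != 768
print(can_apply_linear((1392, 1024), classifier_shape("bert-large-uncased", 14)))  # True
print(head_dim("bert-large-uncased", 16))  # 64
```

If the layer sizes are derived from the loaded model configuration instead of a constant, the same script should work for both checkpoints; whether any other place in the code needs the same change is not something this sketch can confirm.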
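Problem solved
Thanks to the author for the contribution and for sharing the code.
Should line 17 of this code be changed as follows?
Original code: if cur_ts_tag == 'O' or cur_ts_tag == 'EQ':
New code: assert cur_ts_tag != 'EQ' # (an OT sequence contains only O and T-SENTIMENT, never EQ)
if cur_ts_tag == 'O':
For line 114 of this code, should it be handled like this?
Original code: if ts_tag == 'O' or ts_tag == 'EQ':
New code:
if ts_tag == 'EQ': continue # (EQ comes from BERT's wordpiece tokenization, so it should be left out of the sequence rather than replaced with O)
if ts_tag == 'O': (the rest stays unchanged)
These are two small questions about the code; thanks to the author for answering.
Best wishes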
First of all, great work.
I wanted to make a small inquiry about the tag 'EQ' in all the tagging schemas. May I know what it actually means? I didn't see anything related to it in the papers mentioned in the README. I also see that in ot2bieos_ts() in seq_utils.py the tag 'EQ' is replaced with an 'O'. Could you enlighten me on why this is?
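To make the question concrete, here is a minimal sketch of an OT to BIEOS conversion in which 'EQ' is treated exactly like 'O', as described above for ot2bieos_ts(). It is an illustration written for this thread, not the repository's implementation; the function name `ot_to_bieos` and the handling of adjacent targets with different sentiments are assumptions.

```python
def ot_to_bieos(tags):
    """Convert OT tags (e.g. 'O', 'T-POS') to BIEOS tags.

    Assumption for this sketch: 'EQ' is treated exactly like 'O'.
    """
    def is_outside(tag):
        return tag in ("O", "EQ")

    converted = []
    for i, tag in enumerate(tags):
        if is_outside(tag):
            converted.append("O")
            continue
        prev_tag = tags[i - 1] if i > 0 else "O"
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        starts = prev_tag != tag  # previous token is outside or has another sentiment
        ends = next_tag != tag    # next token is outside or has another sentiment
        sentiment = tag.split("-", 1)[1]
        if starts and ends:
            prefix = "S"
        elif starts:
            prefix = "B"
        elif ends:
            prefix = "E"
        else:
            prefix = "I"
        converted.append(f"{prefix}-{sentiment}")
    return converted


print(ot_to_bieos(["O", "T-POS", "T-POS", "T-POS", "O", "EQ", "T-NEG"]))
# ['O', 'B-POS', 'I-POS', 'E-POS', 'O', 'O', 'S-NEG']
```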
How can I test the model at the sentence level? For example, I input "this movie is very good" and get back positive.
Hi,
In your paper, did you use a bidirectional GRU layer or a unidirectional one? In this code I can see that bidirectional is set to True by default.
Thank you
Hi,
When I try to run fast_run.py, even though I have every requirement installed, I get the following error:
File "main.py", line 11, in <module>
from transformers import AdamW, WarmupLinearSchedule
ImportError: cannot import name 'WarmupLinearSchedule' from 'transformers' (/Users/doruk/opt/anaconda3/lib/python3.8/site-packages/transformers/__init__.py)
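The import fails because the installed transformers release does not export `WarmupLinearSchedule`. Newer releases provide a function named `get_linear_schedule_with_warmup` for the same purpose, so one option is to install the version pinned in the requirements and another is to adapt the import. The schedule itself is simple; the sketch below writes it in plain Python under the assumption that it is a linear warmup followed by a linear decay to zero. The name `linear_warmup_decay` and the step counts are illustrative.

```python
def linear_warmup_decay(step, warmup_steps, total_steps):
    """Learning-rate multiplier: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


# Example values only: 1500 total steps with a hypothetical 100 warmup steps.
for step in (0, 50, 100, 800, 1500):
    print(step, linear_warmup_decay(step, warmup_steps=100, total_steps=1500))
```

With a PyTorch optimizer, a function like this can be wrapped in a lambda and passed to `torch.optim.lr_scheduler.LambdaLR`.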
!git clone https://github.com/lixin4ever/BERT-E2E-ABSA.git
!pip install transformers
!pip install tensorboardX
!python fast_run.py
I'm trying to run your model on colab with the previous steps, but it throws the following error:
2021-07-29 14:43:21.980960: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
07/29/2021 14:43:23 - WARNING - main - Process rank: -1, device: cpu, n_gpu: 0, distributed training: False, 16-bits training: False
Traceback (most recent call last):
File "main.py", line 542, in <module>
main()
File "main.py", line 432, in main
config=config, cache_dir='/content/.cache')
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1357, in from_pretrained
_fast_init=_fast_init,
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1455, in _load_state_dict_into_model
model._init_weights(module)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 579, in _init_weights
raise NotImplementedError(f"Make sure `_init_weigths` is implemented for {self.__class__}")
NotImplementedError: Make sure `_init_weigths` is implemented for <class 'absa_layer.BertABSATagger'>
How can I solve this?
Hi,
Thank you for sharing your code with us.
I have a question about predictions. As I read in your paper, you reported a single result for these two tasks. However, is it possible to return evaluation scores for each task separately (as you did in this code: https://github.com/lixin4ever/E2E-TBSA), so that we can compare this model against single-task models?
Thank you
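One way to think about separate scores is to evaluate the same predicted spans twice: once ignoring the sentiment label (aspect extraction only) and once requiring the sentiment to match as well. The sketch below assumes predictions have already been decoded into `(begin, end, sentiment)` tuples for a single sentence; the function names are hypothetical and this is not the evaluation code used in the paper.

```python
def f1_score(gold, pred):
    """F1 over two collections of hashable items."""
    gold, pred = set(gold), set(pred)
    hits = len(gold & pred)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)


def split_scores(gold_spans, pred_spans):
    """Score unified (begin, end, sentiment) spans in two ways.

    'aspect' ignores the sentiment label; 'unified' requires both the
    boundaries and the sentiment to match.
    """
    def boundaries(spans):
        return [(begin, end) for begin, end, _ in spans]

    return {
        "aspect": f1_score(boundaries(gold_spans), boundaries(pred_spans)),
        "unified": f1_score(gold_spans, pred_spans),
    }


gold = [(0, 0, "POS"), (3, 3, "NEG")]
pred = [(0, 0, "POS"), (3, 3, "POS")]
print(split_scores(gold, pred))  # {'aspect': 1.0, 'unified': 0.5}
```

For a whole corpus, each tuple would also need a sentence index so that spans from different sentences do not collide.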
Hi @lixin4ever, thank you for this amazing work. I have a query about preparing raw data for use with the trained model.
If I want tags for the data "Service is good, ambience is bad", how can I get the tags together with the sentiment (T-POS, T-NEU or T-NEG)?
I am able to get ordinary tags like [B-ORG] using spaCy, but what is the procedure for getting the data in this format?
It would be very useful if you could help me with this.
Thanks
Hi!
Is there a way to disable the tagging part of the input text file? Can the code be modified so that the input data doesn't require the part after '####' (for inference)?
Line 442 in cf4ddc6
I would like to know why you feed tagger_input into self.classifier when absa_type == crf. That is, you don't use self.tagger = CRF() when absa_type == crf.
Thank you for releasing your code. I have been able to train a model using my own dataset. I now want to use the trained model to predict on unseen data. My question is: what format should the data (which has no annotations) have, and which flags should I set when only running predictions with a trained model?
Thanks very much, Best
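Other posts on this page describe the data lines as a sentence and a tag sequence separated by "####", and one user reports preparing unlabeled text as if it were a test set with every token labeled "O". Under those assumptions, a minimal sketch for generating such placeholder lines could look like this. The helper name `to_placeholder_line` and the whitespace tokenization are illustrative choices, not part of the repository.

```python
def to_placeholder_line(sentence):
    """Format an unlabeled sentence as 'sentence####tok=O tok=O ...'."""
    tokens = sentence.split()
    tagged = " ".join(f"{token}=O" for token in tokens)
    return f"{' '.join(tokens)}####{tagged}"


print(to_placeholder_line("Service is good , ambience is bad"))
# Service is good , ambience is bad####Service=O is=O good=O ,=O ambience=O is=O bad=O
```

The placeholder tags only satisfy the file format; any evaluation scores computed against them are meaningless.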
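Hello!
I am a student working in NLP. I recently read your paper and found it excellent, so I am currently reproducing your method and results, and I hope to apply them to other datasets later.
I noticed that your method and source code are mainly designed for sentiment classification (positive, negative and neutral). Since I would like to predict a continuous sentiment score in the range (-1, 1), I want to ask whether score-type prediction could be achieved with small changes to your method.
Thank you!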
Hi,
I want to use the output variable that stores the (aspect, sentiment) pairs of the input sentence. I think it is the output_ts variable in work.py.
I want to use it in a different .py file, so I import work.py as a module and try to access output_ts there. The error I get is that work.py doesn't have an attribute named "output_ts". I also tried returning output_ts at the end of the predict function inside work.py, but that didn't work either. Is there a way to solve this issue?
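A variable assigned inside a function is local to that function, so it never becomes an attribute of the module. The general pattern is to return the value and capture it where the function is called. The sketch below uses a stand-in `predict` with a simplified signature; the real function in work.py takes different arguments, so treat the names here as hypothetical.

```python
def predict(sentences):
    """Stand-in for a predict() function; the real model call is omitted."""
    output_ts = []  # local to predict(); not visible as a module attribute
    for sentence in sentences:
        # Placeholder result: a real implementation would decode model output.
        output_ts.append({"sentence": sentence, "aspects": []})
    return output_ts  # hand the results back to the caller


if __name__ == "__main__":
    # Code under this guard does not run when the file is imported.
    print(predict(["The battery life is great"]))
```

From another file, the caller would then import the function and keep its return value, for example `results = predict(my_sentences)`. Returning the value has no effect unless the call site also assigns it.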
Hey, great work!
I was trying to analyze the sentences for which the model produces incorrect output. However, the sentences have been converted into a matrix of features in glue_utils, and the loss is calculated afterwards. Could you provide a step-by-step way to print the gold output, the predicted output, and the corresponding aspects?
Thank you in advance
I don't quite understand what "local_rank" means in the code, could you explain it please? Thank you.
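In PyTorch-style training scripts, `local_rank` conventionally identifies the process (and usually the GPU) on the current machine during distributed training, and -1 means that distributed training is off. A log line elsewhere on this page ("Process rank: -1 ... distributed training: False") is consistent with that convention. The sketch below shows the usual pattern with argparse; it is an illustration, not a copy of main.py.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # -1 is the conventional "not distributed" value; a launcher such as
    # torch.distributed.launch passes 0, 1, ... to each process on a node.
    parser.add_argument("--local_rank", type=int, default=-1)
    return parser.parse_args(argv)


def describe(local_rank):
    """Return (distributed, is_main_process) for a given local rank."""
    distributed = local_rank != -1
    is_main_process = local_rank in (-1, 0)
    return distributed, is_main_process


print(describe(parse_args([]).local_rank))                     # (False, True)
print(describe(parse_args(["--local_rank", "1"]).local_rank))  # (True, False)
```

Scripts typically use the second value to make sure that only one process writes logs and checkpoints.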
From what I understand based on the official paper, the approach used in this repository is trying to predict the following sequence of tags based on the input sentence:
The train.txt files in the data folder are used to train the model to predict such a sequence. I also noticed that each line in the file consists of both (1) the sentence and (2) the tag sequence, which are separated by "####". Regarding this, I have several questions:
Hi @lixin4ever, I appreciate your work. Thank you for that.
I have a query: you have provided datasets for laptop and rest_total.
I want to prepare a similar dataset for Twitter.
Can you help me with how to prepare the dataset with appropriate labels, like those defined in the get_labels function in glue_utils.py?
That is, how can I label the data?
Do I also need cached data? If yes, how can I prepare it?
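Hello, I saw that in your earlier paper, A Unified Model for Opinion Target Extraction and Target Sentiment Prediction, you used Target Boundary Guided TBSA and Maintaining Sentiment Consistency to guide the model away from outputting inconsistent sentiment labels or BIO labels. However, in this work, which combines BERT with different downstream models, I do not see any similar mechanism.
Are these mechanisms unnecessary for the downstream models on top of BERT? Do problems such as sentiment inconsistency basically not occur?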
Hi,
I am trying to upload the fine-tuning data to my forked repository (I am using Git Large File Storage, since the file is bigger than 10 MB), but I get an error saying that I cannot upload new objects to a public fork.
I think the reason behind this error is that there are no LFS objects in the original repository. How can I solve this?
Hi,
I want to extract the sentiment directly for a given sentence. Is this possible with the repo in its current state? Or do we have to train the model first and then make some changes in the work.py file to get the predictions?
Thanks for the updated code.
Hello, I would like to ask a question. The example in the paper uses BIOES, i.e. The=O AMD=B-POS Turin=I-POS Processor=E-POS seems=O to=O perform=O better=O than=O Intel=S-NEG; but in the dataset I see The=O Mountain=T-POS Lion=T-POS OS=T-POS is=O not=O hard=O to=O figure=O out=O if=O you=O are=O familiar=O with=O Microsoft=T-NEU Windows=T-NEU .=O, where every target tag starts with T. Shouldn't the data also be in BIOES format?
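The data files appear to store the simpler OT tags, while the scripts on this page pass --tagging_schema BIEOS and another post mentions an ot2bieos_ts() helper in seq_utils.py, which suggests that the conversion happens when the data is loaded. As a small illustration of the file format only, the sketch below splits one line into tokens and tags. The function name `parse_line` and the sample line are made up for this example.

```python
def parse_line(line):
    """Split 'sentence####tok=TAG tok=TAG ...' into tokens and tags."""
    sentence, tag_string = line.strip().split("####")
    tokens, tags = [], []
    for item in tag_string.split():
        # rsplit keeps tokens that themselves contain '=' intact.
        token, tag = item.rsplit("=", 1)
        tokens.append(token)
        tags.append(tag)
    return sentence, tokens, tags


line = "Windows is fine .####Windows=T-NEU is=O fine=O .=O"
sentence, tokens, tags = parse_line(line)
print(tokens)  # ['Windows', 'is', 'fine', '.']
print(tags)    # ['T-NEU', 'O', 'O', 'O']
```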
closed
Hi,
When performing inference on around 500k sentences, the speed starts out very fast: the first 50k sentences are done in around 30 minutes on a Titan V. Afterwards, however, the speed drops significantly, and the next 40k sentences take around 10 hours, with an estimate of over 50 hours for the remaining sentences. I don't understand why this is.
When will this condition be satisfied, given that the sentiments list cannot hold more than one element at this point?
elif pos == 'B':
    beg = i
    if len(sentiments) > 1:
        # remove the effect of the noisy I-{POS,NEG,NEU}
        # print(sentiments)
        sentiments = [sentiments[-1]]
Connection error, and we cannot find the requested files in the cached path.
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
Hi, thanks for the great repo!
BERT runs very well for me; however, when switching to XLNet I constantly get this error.
error:
Traceback (most recent call last):
File "main.py", line 522, in <module>
main()
File "main.py", line 412, in main
config=config, cache_dir='./cache')
File "C:\Anaconda3\envs\py37_dev\lib\site-packages\transformers\modeling_utils.py", line 342, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "\BERT-E2E-ABSA\absa_layer.py", line 488, in __init__
self.tagger_config = xlnet_config.absa_tagger_config
AttributeError: 'XLNetConfig' object has no attribute 'absa_tagger_config'
Here's my train.bat:
#!/usr/bin/env bash
set TASK_NAME="rest_total"
set ABSA_TYPE="tfm"
set CUDA_VISIBLE_DEVICES=0
python main.py --model_type xlnet ^
--absa_type %ABSA_TYPE% ^
--tfm_mode finetune ^
--fix_tfm 0 ^
--model_name_or_path xlnet-base-cased ^
--data_dir ./data/sents ^
--task_name %TASK_NAME% ^
--per_gpu_train_batch_size 16 ^
--per_gpu_eval_batch_size 8 ^
--learning_rate 2e-5 ^
--do_train ^
--do_eval ^
--tagging_schema BIEOS ^
--overfit 0 ^
--overwrite_output_dir ^
--eval_all_checkpoints ^
--MASTER_ADDR localhost ^
--MASTER_PORT 28512 ^
--max_steps 2000
I am running transformers 2.0.0 as per the requirements.
>>> import transformers as t
>>> t.__version__
'2.0.0'
Hi, I really appreciate your work, and I'm wondering how long training takes.
Hi @lixin4ever
Does this model support extracting aspect terms from fresh restaurant or laptop reviews? If yes, please guide me through it.
I tried to run a trained model on a "large" set of book reviews (15 MB).
I prepared the file as if it were a test set, with all tokens labeled as "O".
I get the following error:
Load checkpoint ./bert-tfm-bookreviews-finetune/checkpoint-1200/pytorch_model.bin...
test class count: [0. 0. 0.]
***** Running prediction *****
Evaluating: 0%| | 0/69420 [00:00<?, ?it/s]
Traceback (most recent call last):
File "work.py", line 216, in <module>
main()
File "work.py", line 125, in main
predict(args, model, tokenizer)
File "work.py", line 161, in predict
outputs = model(**inputs)
File "/home/p286012/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/var/tmp/andreas/bookreviews-absa/BERT-E2E-ABSA/absa_layer.py", line 437, in forward
attention_mask=attention_mask, head_mask=head_mask)
File "/home/p286012/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/p286012/.local/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 964, in forward
past_key_values_length=past_key_values_length,
File "/home/p286012/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/p286012/.local/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 206, in forward
embeddings += position_embeddings
RuntimeError: The size of tensor a (1304) must match the size of tensor b (512) at non-singleton dimension 1
sh work-unlabeled.sh > 398.30s user 17.21s system 101% cpu 6:50.79 total
512 happens to be the limit of BERT, so maybe the input didn't get truncated correctly, but I used the default of a maximum of 128 tokens per sentence.
work-unlabeled.sh is basically the default, only I used cased BERT:
#!/usr/bin/env bash
TASK_NAME="bookreviews-goodreads_rest"
ABSA_HOME="./bert-tfm-bookreviews-finetune"
CUDA_VISIBLE_DEVICES=0 python work.py --absa_home ${ABSA_HOME} \
--ckpt ${ABSA_HOME}/checkpoint-1200 \
--model_type bert \
--data_dir ./data/${TASK_NAME} \
--task_name ${TASK_NAME} \
--model_name_or_path bert-base-cased \
--cache_dir ./cache \
--max_seq_length 128 \
--tagging_schema BIEOS
Similarly for train.sh:
#!/usr/bin/env bash
TASK_NAME=bookreviews
ABSA_TYPE=tfm
CUDA_VISIBLE_DEVICES=0,2,3 python main.py --model_type bert \
--absa_type ${ABSA_TYPE} \
--tfm_mode finetune \
--fix_tfm 0 \
--model_name_or_path bert-base-cased \
--data_dir ./data/${TASK_NAME} \
--task_name ${TASK_NAME} \
--per_gpu_train_batch_size 16 \
--per_gpu_eval_batch_size 8 \
--learning_rate 2e-5 \
--do_train \
--do_eval \
--tagging_schema BIEOS \
--overfit 0 \
--overwrite_output_dir \
--eval_all_checkpoints \
--MASTER_ADDR localhost \
--MASTER_PORT 28512 \
--max_steps 1500
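The error above says that a sequence of 1304 positions met position embeddings sized for 512, so at least one input was longer than the model limit when it reached BERT. A minimal sketch of the truncation step that is normally expected before the encoder is shown below. The name `truncate_for_bert` is hypothetical, and the sketch makes no claim about where the repository's own truncation went wrong.

```python
def truncate_for_bert(tokens, max_seq_length=128):
    """Clip a wordpiece list so that [CLS] + tokens + [SEP] fits the limit."""
    budget = max_seq_length - 2  # reserve room for [CLS] and [SEP]
    if budget < 0:
        raise ValueError("max_seq_length must be at least 2")
    return ["[CLS]"] + tokens[:budget] + ["[SEP]"]


# Illustrative input: 1302 tokens give 1304 positions once the two special
# tokens are added, mirroring the size reported in the error.
long_review = ["tok"] * 1302
clipped = truncate_for_bert(long_review, max_seq_length=128)
print(len(clipped))  # 128
```

Any per-token labels would need to be clipped to the same length so that tokens and tags stay aligned. Splitting long reviews into sentences before tagging is another way to stay under the limit.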
Hi there,
Running work.sh on a tfm model works fine, but with crf I get the following error:
Load checkpoint ./bert-crf-bookreviews-finetune/checkpoint-1200/pytorch_model.bin...
cached_features_file: ./data/bookreviews/cached_test_bert-base-cased_128_bookreviews
***** Running prediction *****
Evaluating: 0%|▏ | 1/800 [00:00<01:51, 7.17it/s]
Traceback (most recent call last):
File "work.py", line 216, in <module>
main()
File "work.py", line 125, in main
predict(args, model, tokenizer)
File "work.py", line 200, in predict
total_preds = np.append(total_preds, preds, axis=0)
File "<__array_function__ internals>", line 6, in append
File "/home/p286012/.local/lib/python3.7/site-packages/numpy/lib/function_base.py", line 4745, in append
return concatenate((arr, values), axis=axis)
File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 12 and the array at index 1 has size 19
Any idea what's wrong? Training did work.
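The message indicates that two batches of predictions have different widths (12 and 19), which would happen if each batch is decoded only up to its own longest sequence. That cause is an assumption, not something confirmed here. Under it, one workaround is to pad every batch to a common width before concatenating, sketched below with plain lists and hypothetical helper names.

```python
def pad_batch(batch, width, pad_value=0):
    """Right-pad every row of a 2-D list to the given width."""
    return [row + [pad_value] * (width - len(row)) for row in batch]


def stack_batches(batches, pad_value=0):
    """Concatenate batches of predictions whose row widths differ."""
    width = max(len(row) for batch in batches for row in batch)
    stacked = []
    for batch in batches:
        stacked.extend(pad_batch(batch, width, pad_value))
    return stacked


first = [[1] * 12, [2] * 12]  # batch decoded to width 12
second = [[3] * 19]           # batch decoded to width 19
merged = stack_batches([first, second])
print(len(merged), {len(row) for row in merged})  # 3 {19}
```

The pad value should be one that later steps ignore, for example a label id that is masked out together with the padded input positions.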
def get_labels(self, tagging_schema):
    if tagging_schema == 'OT':
        return []
    elif tagging_schema == 'BIO':
        return ['O', 'EQ', 'B-POS', 'I-POS', 'B-NEG', 'I-NEG', 'B-NEU', 'I-NEU']
    elif tagging_schema == 'BIEOS':
        return ['O', 'EQ', 'B-POS', 'I-POS', 'E-POS', 'S-POS',
                'B-NEG', 'I-NEG', 'E-NEG', 'S-NEG',
                'B-NEU', 'I-NEU', 'E-NEU', 'S-NEU']
    else:
        raise Exception("Invalid tagging schema %s..." % tagging_schema)
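I want to run this model on a Chinese dataset. I have already converted the data to the same format as the English data, but I don't know where to switch the pretrained BERT model to a chinese_bert model.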
Hi Li Xin,
Thank you for this repo!
I tried to replicate your result with the command 'python fast_run.py' and I encountered this problem.
It seems that this is a problem on the transformers side. I tried installing it both with pip and from source (latest version 3.0.2) on PyTorch 1.2 (Ubuntu 20.04, Python 3.7, CUDA 11).
Did I miss something? Do you have any idea why this happens?
I searched the internet and the Hugging Face forum, but I could not find anyone mentioning this problem.
Traceback (most recent call last):
File "main.py", line 12, in <module>
from absa_layer import BertABSATagger, XLNetABSATagger
File "/home/hoa/BERT_E2E_ABSA/absa_layer.py", line 5, in <module>
from bert import BertPreTrainedModel, XLNetPreTrainedModel
File "/home/hoa/BERT_E2E_ABSA/bert.py", line 19, in <module>
from transformers import BERT_PRETRAINED_MODEL_ARCHIVE_MAP, BERT_PRETRAINED_CONFIG_ARCHIVE_MAP
ImportError: cannot import name 'BERT_PRETRAINED_MODEL_ARCHIVE_MAP' from 'transformers' (/home/hoa/transformers/src/transformers/__init__.py)
Note that transformers itself is installed correctly, as it can run the following:
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I hate you'))"
[{'label': 'NEGATIVE', 'score': 0.9991129040718079}]
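Another post on this page mentions running transformers 2.0.0 "as per the requirements", whereas this report uses 3.0.2, so a version mismatch is a plausible cause of the failed import. The sketch below is a small check that compares an installed version string against a pinned one. The helper names are hypothetical, and the pin of 2.0.0 is taken from that other post and not verified against the requirements file.

```python
import itertools


def parse_version(text):
    """Turn '3.0.2' into (3, 0, 2), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def matches_pin(installed, pinned="2.0.0"):
    """True when the installed release equals the pinned one."""
    return parse_version(installed) == parse_version(pinned)


print(matches_pin("2.0.0"))  # True
print(matches_pin("3.0.2"))  # False
```

In practice the installed string would come from `transformers.__version__`, as shown in an earlier post.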
Hi!
I was wondering whether, after training the model, I can use it on new data to do aspect extraction and also to find the aspect sentiment polarity.
If yes, how can I do it?
Thanks!
Where have you deployed the model ?
If you have not deployed the model, I can help you out.
Let's collaborate!