monologg / r-bert Goto Github PK

Pytorch implementation of R-BERT: "Enriching Pre-trained Language Model with Entity Information for Relation Classification"

License: Apache License 2.0

Python 69.12% Perl 30.88%

bert relation-extraction bert-relation-extraction pytorch relation relation-bert relation-classification

r-bert's Introduction

🚀 Things I do

NLP Engineer, contributing on Korean NLP with Open Source!

📬 Find me at

r-bert's People

Contributors

Stargazers

Watchers

r-bert's Issues

Question about F1 results

Hello, thanks for your works.
I got the final F1 result 82.0% after 5 epoch training while 89.25% in the paper. What about you?

看到效果达到89.69%的关系抽取源码实现，推荐一波 https://github.com/DongPoLI/EC-BERT

How can i train my own dataset?

为什么执行了python3 main.py --do_train --do_eval后result.txt文件是空的？

你好：
请教一下，为什么执行了python3 main.py --do_train --do_eval之后，result.txt 文件是空的，请问这是什么原因呢？谢谢

CUDA out of memory

When trying to run

python main.py --do_train --do_eval

I obtain a CUDA out of memory error. Is there any way to fix this?

I am a student experimenting with Relation Extraction models and do not have extra memory at my disposal.

Full error trace:
File "main.py", line 65, in
main(args)
File "main.py", line 18, in main
trainer.train()
File "C:\Users\Famke\R-BERT-master\trainer.py", line 84, in train
outputs = self.model(**inputs)
File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Famke\R-BERT-master\model.py", line 58, in forward
token_type_ids=token_type_ids) # sequence_output, pooled_output, (hidden_states), (attentions)
File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 790, in forward
encoder_attention_mask=encoder_extended_attention_mask,
File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 407, in forward
hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 368, in forward
self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 314, in forward
hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 234, in forward
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 4.00 GiB total capacity; 3.06 GiB already allocated; 2.57 MiB free; 3.11 GiB reserved in total by PyTorch)
Epoch: 0%| | 0/5 [00:01<?, ?it/s]
Iteration: 0%| | 0/500 [00:01<?, ?it/s]

Custom dataset with different relations

Hi,
I'm trying to train R-BERT on my dataset with different relations using your repo. I reformatted my dataset to be in your format for semeval, but when I run the train eval command I get an error because the evaluator tries to read from the result file and finds it empty.

When I looked at the eval directory before training, I found two files, I understand that the answer_keys file in used during evaluating model performance, I constructed my own answer_keys file.

About the second file, I changed the if part that contains relations for semeval to my relations. I still got the same error:
Evaluating: 100% 1178/1178 [02:04<00:00, 9.48it/s]
Bad file format on line 37675: '155333 date of birth(e1, e2)'
Iteration: 3% 249/7354 [03:24<1:37:08, 1.22it/s]
Epoch: 0% 0/10 [03:24<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 117, in
main(args)
File "main.py", line 19, in main
trainer.train()
File "/content/drive/MyDrive/R-BERT/trainer.py", line 125, in train
self.evaluate("test") # There is no dev set for semeval task
File "/content/drive/MyDrive/R-BERT/trainer.py", line 192, in evaluate
result = compute_metrics(preds, out_label_ids)
File "/content/drive/MyDrive/R-BERT/utils.py", line 54, in compute_metrics
return acc_and_f1(preds, labels)
File "/content/drive/MyDrive/R-BERT/utils.py", line 65, in acc_and_f1
"f1": official_f1(),
File "/content/drive/MyDrive/R-BERT/official_eval.py", line 17, in official_f1
macro_result = list(f)[-1]
IndexError: list index out of range

Mul-BERT 在 the SemEval 2010 Task 8 dataset 到达 90.72 (Macro-F1）方法非常简单，

https://github.com/DongPoLI/Mul-BERT

The question is that the information of e1 and e2 is obtained in predict.py, but predict.py does not seem to be used in train .

运行predict.py出现错误：FileNotFoundError: [Errno 2] No such file or directory: './model\\training_args.bin'

The F1-Score decrease after the last update.

This is a good job. Thank you.
And
I remember the f1-score > 88 some days ago, but the f1-score < 88 I ran last night.
Because the entities fully-connected layer use the same weight after the update?

Could you share how to adjust parameters such as the learning rate to get more results?
Because I want to use R-BERT in my dataset, but the result is not very well.
Thanks.

FClayer for two entities

Hi,

In the paper, FClayers for two entities share the same parameters. However, in model.py file, there are two FClayers with different parameters. I want to ask about the performance of these two settings. Thanks.

Segmentation fault (core dumped)

07/01/2020 19:31:38 - INFO - data_loader - Loading features from cached file ./data/cached_train_semeval_bert-base-uncased_384
Segmentation fault (core dumped)

I run this model without any changes.

Some model files might be missing...

06/21/2020 10:13:15 - INFO - transformers.modeling_utils - loading weights file ./model\pytorch_model.bin
Traceback (most recent call last):
File "D:\WorkSpace\Pythonspace\R-BERT-master\trainer.py", line 200, in load_model
self.model = self.model_class.from_pretrained(self.args.model_dir)
File "D:\Programming\Anaconda3\lib\site-packages\transformers\modeling_utils.py", line 512, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
TypeError: init() missing 1 required positional argument: 'args'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:/WorkSpace/Pythonspace/R-BERT-master/main.py", line 63, in
main(args)
File "D:/WorkSpace/Pythonspace/R-BERT-master/main.py", line 21, in main
trainer.load_model()
File "D:\WorkSpace\Pythonspace\R-BERT-master\trainer.py", line 204, in load_model
raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...

How can I retrain the model from the past trained model ?

As above
Thanks Park.

how can i train the R-bert in chinese dataset?

Excuse me, when I train the R-bert in chinese dataset, it shows the error-->'list index out of range',because the result.txt in eval is empty, so i want to know if the official_eval.py can test the chinese data?
if not, could u give me some advices?
Thanks a lot!!

Training on a custom dataset

When I run the training command this error shows up, do you have any idea what could cause it?

monologg / r-bert Goto Github PK

r-bert's Introduction

🚀 Things I do

📬 Find me at

r-bert's People

Contributors

Stargazers

Watchers

Forkers

r-bert's Issues

Recommend Projects

Recommend Topics

Recommend Org