
r-bert's Introduction

🚀 Things I do

  • NLP Engineer, contributing to Korean NLP through open source!


r-bert's People

Contributors

monologg


r-bert's Issues

Question about F1 results

Hello, thanks for your work.
After 5 epochs of training I get a final F1 of 82.0%, whereas the paper reports 89.25%. What result do you get?

CUDA out of memory

When trying to run

python main.py --do_train --do_eval

I obtain a CUDA out of memory error. Is there any way to fix this?

I am a student experimenting with Relation Extraction models and do not have extra memory at my disposal.

Full error trace:
  File "main.py", line 65, in <module>
    main(args)
  File "main.py", line 18, in main
    trainer.train()
  File "C:\Users\Famke\R-BERT-master\trainer.py", line 84, in train
    outputs = self.model(**inputs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\R-BERT-master\model.py", line 58, in forward
    token_type_ids=token_type_ids) # sequence_output, pooled_output, (hidden_states), (attentions)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 790, in forward
    encoder_attention_mask=encoder_extended_attention_mask,
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 407, in forward
    hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 368, in forward
    self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 314, in forward
    hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 234, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 4.00 GiB total capacity; 3.06 GiB already allocated; 2.57 MiB free; 3.11 GiB reserved in total by PyTorch)
Epoch: 0%| | 0/5 [00:01<?, ?it/s]
Iteration: 0%| | 0/500 [00:01<?, ?it/s]
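
The trace fails while allocating the attention-score tensor (torch.matmul(query_layer, key_layer.transpose(-1, -2))), whose size grows linearly with the batch size and quadratically with the sequence length. As a rough sketch, the size of that single tensor for one layer can be estimated as follows. The 12 heads match the published BERT-base configuration and 4 bytes per value assumes float32; the function name and the batch size and sequence length values are illustrative assumptions, not settings confirmed in this thread.

```python
def attention_scores_bytes(batch_size, seq_len, num_heads=12, bytes_per_value=4):
    """Bytes needed for one layer's attention-score tensor.

    The tensor has shape (batch_size, num_heads, seq_len, seq_len).
    num_heads=12 assumes BERT-base; bytes_per_value=4 assumes float32.
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_value


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / 2**20


# Illustrative (assumed) settings, not values taken from the issue.
for batch_size, seq_len in [(16, 384), (8, 384), (16, 128)]:
    size = to_mib(attention_scores_bytes(batch_size, seq_len))
    print(f"batch={batch_size} seq_len={seq_len}: {size:.2f} MiB")
# batch=16 seq_len=384: 108.00 MiB
# batch=8 seq_len=384: 54.00 MiB
# batch=16 seq_len=128: 12.00 MiB
```

This counts only one tensor in one layer; the full forward and backward pass holds many more activations. On a 4 GiB card, a smaller batch size or a shorter maximum sequence length is therefore a reasonable first thing to try, and the exact option names should be checked against the training script's argument parser.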

Custom dataset with different relations

Hi,
I'm trying to train R-BERT on my own dataset, which has different relations, using your repo. I reformatted my dataset to match your SemEval format, but when I run the train/eval command I get an error because the evaluator tries to read from the result file and finds it empty.

When I looked at the eval directory before training, I found two files. I understand that the answer_keys file is used when evaluating model performance, so I constructed my own answer_keys file.

In the second file, I changed the if block that lists the SemEval relations to use my relations. I still get the same error:
Evaluating: 100% 1178/1178 [02:04<00:00, 9.48it/s]
Bad file format on line 37675: '155333 date of birth(e1, e2)'
Iteration: 3% 249/7354 [03:24<1:37:08, 1.22it/s]
Epoch: 0% 0/10 [03:24<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 117, in <module>
    main(args)
  File "main.py", line 19, in main
    trainer.train()
  File "/content/drive/MyDrive/R-BERT/trainer.py", line 125, in train
    self.evaluate("test") # There is no dev set for semeval task
  File "/content/drive/MyDrive/R-BERT/trainer.py", line 192, in evaluate
    result = compute_metrics(preds, out_label_ids)
  File "/content/drive/MyDrive/R-BERT/utils.py", line 54, in compute_metrics
    return acc_and_f1(preds, labels)
  File "/content/drive/MyDrive/R-BERT/utils.py", line 65, in acc_and_f1
    "f1": official_f1(),
  File "/content/drive/MyDrive/R-BERT/official_eval.py", line 17, in official_f1
    macro_result = list(f)[-1]
IndexError: list index out of range
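
The IndexError is raised because official_f1() takes the last line of the scorer's result file (list(f)[-1]) and that file is empty. While the scorer input is being debugged, a plain macro-averaged F1 computed directly in Python can serve as a stand-in. The sketch below is not the official SemEval scorer and will not necessarily reproduce its numbers; macro_f1, the exclude parameter, and the "employer" label are hypothetical names used only for illustration.

```python
from collections import Counter


def macro_f1(preds, labels, exclude=()):
    """Unweighted mean of per-class F1 over all classes not in `exclude`.

    A simple stand-in metric, NOT the official SemEval-2010 Task 8 scorer.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, gold in zip(preds, labels):
        if pred == gold:
            tp[gold] += 1
        else:
            fp[pred] += 1   # predicted class gets a false positive
            fn[gold] += 1   # gold class gets a false negative

    classes = sorted((set(preds) | set(labels)) - set(exclude))
    scores = []
    for c in classes:
        p_den, r_den = tp[c] + fp[c], tp[c] + fn[c]
        precision = tp[c] / p_den if p_den else 0.0
        recall = tp[c] / r_den if r_den else 0.0
        pr_sum = precision + recall
        scores.append(2 * precision * recall / pr_sum if pr_sum else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data; "employer" is a made-up relation label.
gold = ["date of birth", "date of birth", "employer", "Other"]
pred = ["date of birth", "employer", "employer", "Other"]
print(round(macro_f1(pred, gold), 4))
print(round(macro_f1(pred, gold, exclude=("Other",)), 4))
```

Because this works on label strings in memory, it does not depend on how relation names are written to the result file, which helps separate a formatting problem from a modelling problem.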

F1 score decreased after the last update

Great work, thank you.
I remember the F1 score being above 88 a few days ago, but the run I did last night scored below 88.
Is that because the entity fully-connected layers use the same weights since the update?

Could you also share how to tune parameters such as the learning rate to get better results?
I want to use R-BERT on my own dataset, but the results are not very good.
Thanks.

FClayer for two entities

Hi,

In the paper, the FC layers for the two entities share the same parameters. However, in model.py there are two FC layers with different parameters. How does the performance of these two settings compare? Thanks.
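
For reference, the two settings differ only in whether the entity projections are tied. The following is a sketch written in what is assumed to be the paper's notation, where H_i..H_j and H_k..H_m are the final hidden states of the first and second entity spans:

```latex
\begin{align}
H'_1 &= W_1\left[\tanh\left(\frac{1}{j-i+1}\sum_{t=i}^{j} H_t\right)\right] + b_1 \\
H'_2 &= W_2\left[\tanh\left(\frac{1}{m-k+1}\sum_{t=k}^{m} H_t\right)\right] + b_2
\end{align}
% Shared setting (as described for the paper):  W_1 = W_2,\; b_1 = b_2
% Separate setting (as described for model.py): W_1, b_1 and W_2, b_2 are learned independently
```

The separate setting doubles the parameters of this projection; which setting scores higher is the open question of this issue.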

Segmentation fault (core dumped)

07/01/2020 19:31:38 - INFO - data_loader - Loading features from cached file ./data/cached_train_semeval_bert-base-uncased_384
Segmentation fault (core dumped)

I ran the model without any changes.

Some model files might be missing...

06/21/2020 10:13:15 - INFO - transformers.modeling_utils - loading weights file ./model\pytorch_model.bin
Traceback (most recent call last):
  File "D:\WorkSpace\Pythonspace\R-BERT-master\trainer.py", line 200, in load_model
    self.model = self.model_class.from_pretrained(self.args.model_dir)
  File "D:\Programming\Anaconda3\lib\site-packages\transformers\modeling_utils.py", line 512, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: __init__() missing 1 required positional argument: 'args'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/WorkSpace/Pythonspace/R-BERT-master/main.py", line 63, in <module>
    main(args)
  File "D:/WorkSpace/Pythonspace/R-BERT-master/main.py", line 21, in main
    trainer.load_model()
  File "D:\WorkSpace\Pythonspace\R-BERT-master\trainer.py", line 204, in load_model
    raise Exception("Some model files might be missing...")
Exception: Some model files might be missing...
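
The first traceback suggests the files are not actually missing: from_pretrained builds the model with cls(config, *model_args, **model_kwargs), and the model's constructor requires an extra args parameter that was never passed. The toy classes below only mimic that forwarding pattern to show the mechanism; they are hypothetical stand-ins, not the transformers implementation or the repository's model class, and the dropout_rate key is invented for the example.

```python
class ToyPretrainedBase:
    """Stand-in for a base class whose loader forwards extra arguments."""

    def __init__(self, config):
        self.config = config

    @classmethod
    def from_pretrained(cls, model_dir, *model_args, **model_kwargs):
        config = {"model_dir": model_dir}  # placeholder for a loaded config
        # Same call shape as the line shown in the traceback.
        return cls(config, *model_args, **model_kwargs)


class ToyRelationModel(ToyPretrainedBase):
    """Stand-in for a model whose constructor needs an extra `args`."""

    def __init__(self, config, args):
        super().__init__(config)
        self.args = args


# Without `args`, construction fails with the same kind of TypeError.
try:
    ToyRelationModel.from_pretrained("./model")
except TypeError as exc:
    print("fails:", exc)

# Forwarding `args` as a keyword argument lets the constructor succeed.
model = ToyRelationModel.from_pretrained("./model", args={"dropout_rate": 0.1})
print("loaded with args:", model.args)
```

Whether the installed transformers version forwards an args keyword to the model constructor in the same way is an assumption that should be checked against that version's from_pretrained documentation.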

How can I train R-BERT on a Chinese dataset?

When I train R-BERT on a Chinese dataset, I get the error 'list index out of range' because result.txt in the eval directory is empty. Can official_eval.py evaluate Chinese data?
If not, could you give me some advice?
Thanks a lot!
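
Whether the official scorer accepts Chinese text is not settled in this thread, but the 'list index out of range' message itself only says that the result file had no lines. A small guard like the sketch below makes that failure explicit before the last line is read. The function name and error wording are hypothetical, and the file path is passed in rather than assumed.

```python
import tempfile
from pathlib import Path


def read_last_result_line(result_path):
    """Return the last non-blank line of the scorer output.

    Raises ValueError with a descriptive message when the file is empty,
    instead of the bare IndexError produced by list(f)[-1].
    """
    text = Path(result_path).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError(
            f"{result_path} is empty: the scorer produced no output; "
            "check the format of the prediction and answer-key files"
        )
    return lines[-1]


# Demonstration on a temporary, deliberately empty file.
with tempfile.TemporaryDirectory() as tmp:
    empty_file = Path(tmp) / "result.txt"
    empty_file.write_text("", encoding="utf-8")
    try:
        read_last_result_line(empty_file)
    except ValueError as exc:
        print("caught:", exc)
```

With the failure surfaced this way, the next step is to inspect what the scorer printed for the prediction file, since an empty result usually points at the input format rather than the model.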
