When I try to run
python main.py --do_train --do_eval
I get a CUDA out-of-memory error. Is there any way to fix this?
I am a student experimenting with Relation Extraction models and do not have access to a GPU with more memory.
Full error trace:
  File "main.py", line 65, in <module>
    main(args)
  File "main.py", line 18, in main
    trainer.train()
  File "C:\Users\Famke\R-BERT-master\trainer.py", line 84, in train
    outputs = self.model(**inputs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\R-BERT-master\model.py", line 58, in forward
    token_type_ids=token_type_ids)  # sequence_output, pooled_output, (hidden_states), (attentions)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 790, in forward
    encoder_attention_mask=encoder_extended_attention_mask,
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 407, in forward
    hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 368, in forward
    self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 314, in forward
    hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask
  File "C:\Users\Famke\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Famke\Anaconda3\lib\site-packages\transformers\modeling_bert.py", line 234, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 4.00 GiB total capacity; 3.06 GiB already allocated; 2.57 MiB free; 3.11 GiB reserved in total by PyTorch)
Epoch: 0%| | 0/5 [00:01<?, ?it/s]
Iteration: 0%| | 0/500 [00:01<?, ?it/s]
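For context, the approach I was considering is to shrink the per-step batch (since the 4 GiB card is nearly full) while keeping the effective batch size via gradient accumulation. Below is a minimal generic PyTorch sketch of that technique — it uses a toy linear model as a stand-in, not the actual R-BERT trainer, and the names (`accumulation_steps`, `micro_batch`) are my own, not flags of the repo's script:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model standing in for BERT; the accumulation pattern is the same.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

accumulation_steps = 4   # effective batch = micro_batch * accumulation_steps
micro_batch = 2          # small enough to fit in GPU memory

# Fake dataset: 8 micro-batches of (inputs, labels).
data = [(torch.randn(micro_batch, 16), torch.randint(0, 2, (micro_batch,)))
        for _ in range(8)]

initial_weight = model.weight.detach().clone()

optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    # Scale the loss so accumulated gradients average over the effective batch.
    loss = loss_fn(model(x), y) / accumulation_steps
    loss.backward()  # gradients accumulate in .grad across micro-batches
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()        # one update per effective batch
        optimizer.zero_grad()
```

If the R-BERT script happens to expose a batch-size or max-sequence-length argument, lowering those would be the simpler first step; the sketch above is the fallback when the effective batch size must stay the same.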