
opennmt-py-with-bert's People

Contributors

shakeel608


opennmt-py-with-bert's Issues

assert len(self) == len(inputs_) at OpenNMT-py-with-BERT\onmt\modules\util_class.py, line 25, in forward

Hi.

I am using this module on custom data which converts English to a specific pre-defined format. I created the files as described in the README and did the pre-processing with preprocess.py.
However, when I try to train a Transformer model using the command below:

train.py -data data/demo -save_model demo-model -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 -encoder_type transformer -decoder_type transformer -position_encoding -train_steps 200000 -max_generator_batches 2 -dropout 0.1 -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 -max_grad_norm 0 -param_init 0 -param_init_glorot -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 -world_size 4 -gpu_ranks 0 1 2 3

I am getting this error:

  File "OpenNMT-py-with-BERT\onmt\modules\util_class.py", line 25, in forward
    assert len(self) == len(inputs_)

I am unable to understand where exactly the issue lies.

Any help is greatly appreciated.

Thank You
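
For context, a minimal sketch of the check that fails, paraphrased from upstream OpenNMT-py's onmt/modules/util_class.py (the fork's copy may differ slightly). Elementwise keeps one embedding table per source feature, and line 25 asserts that the feature dimension of the incoming tensor splits into exactly that many slices:

import torch
import torch.nn as nn

class Elementwise(nn.ModuleList):
    """One embedding module per input feature; the input's feature
    dimension (dim 2) must split into exactly len(self) slices."""

    def __init__(self, merge=None, *args):
        assert merge in [None, 'first', 'concat', 'sum', 'mlp']
        self.merge = merge
        super(Elementwise, self).__init__(*args)

    def forward(self, inputs):
        # (len, batch, nfeat) -> nfeat tensors of shape (len, batch)
        inputs_ = [feat.squeeze(2) for feat in inputs.split(1, dim=2)]
        # line 25: fails when the number of feature slices does not match
        # the number of embedding tables, e.g. when 768-dim BERT embeddings
        # arrive where token indices (nfeat == 1) are expected
        assert len(self) == len(inputs_)
        outputs = [f(x) for f, x in zip(self, inputs_)]
        if self.merge == 'first':
            return outputs[0]
        elif self.merge in ('concat', 'mlp'):
            return torch.cat(outputs, 2)
        elif self.merge == 'sum':
            return sum(outputs)
        return outputs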

How do I use this software?

I used the OpenNMT-py/data/*.txt files, but I could not get good results.
Could someone please tell me my mistake?

Step 0: Copy data
% cp OpenNMT-py/data/src-train.txt OpenNMT-py-with-BERT/data/
% cp OpenNMT-py/data/tgt-train.txt OpenNMT-py-with-BERT/data/
% cp OpenNMT-py/data/src-val.txt OpenNMT-py-with-BERT/data/
% cp OpenNMT-py/data/tgt-val.txt OpenNMT-py-with-BERT/data/
% cp OpenNMT-py/data/src-test.txt OpenNMT-py-with-BERT/data/

% head -3 data/src-train.txt
It is not acceptable that , with the help of the national bureaucracies , Parliament 's legislative prerogative should be made null and void by means of implementing provisions whose content , purpose and extent are not laid down in advance .
Federal Master Trainer and Senior Instructor of the Italian Federation of Aerobic Fitness , Group Fitness , Postural Gym , Stretching and Pilates; from 2004 , he has been collaborating with Antiche Terme as personal Trainer and Instructor of Stretching , Pilates and Postural Gym .
" Two soldiers came up to me and told me that if I refuse to sleep with them , they will kill me . They beat me and ripped my clothes .

% head -3 data/tgt-train.txt
Es geht nicht an , dass über Ausführungsbestimmungen , deren Inhalt , Zweck und Ausmaß vorher nicht bestimmt ist , zusammen mit den nationalen Bürokratien das Gesetzgebungsrecht des Europäischen Parlaments ausgehebelt wird .
Meistertrainer und leitender Dozent des italienischen Fitnessverbands für Aerobic , Gruppenfitness , Haltungsgymnastik , Stretching und Pilates; arbeitet seit 2004 bei Antiche Terme als Personal Trainer und Lehrer für Stretching , Pilates und Rückengymnastik .
Also kam ich nach Südafrika " , erzählte eine Frau namens Grace dem Human Rights Watch-Mitarbeiter Gerry Simpson , der die Probleme der zimbabwischen Flüchtlinge in Südafrika untersucht .

Step 1: Preprocess the data
$ time python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo

Step 2: Train the model
$ time python -u train.py -data data/demo -save_model demo-model -train_steps 1000 -rnn_size 768 -word_vec_size 768 -encoder_type rnn -decoder_type rnn -optim adagrad -learning_rate 0.15 -valid_steps 100 -save_checkpoint_steps 100 -log_file log.txt

Step 3: Translate
$ time python translate.py -model demo-model_step_1000.pt -src data/src-test.txt -output pred.txt -replace_unk -verbose
PRED AVG SCORE: -1.1431, PRED PPL: 3.1365
python translate.py -model demo-model_step_1000.pt -src data/src-test.txt  11940.70s user 83.19s system 651% cpu 30:44.76 total
[2020-02-03 17:34:12,496 INFO] Translating shard 0.
SENT 1: ['[CLS]', 'orlando', 'bloom', 'and', 'miranda', 'kerr', 'still', 'love', 'each', 'other', '[SEP]']
PRED 1: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
PRED SCORE: -114.3066
SENT 2: ['[CLS]', 'actors', 'orlando', 'bloom', 'and', 'model', 'miranda', 'kerr', 'want', 'to', 'go', 'their', 'separate', 'ways', '.', '[SEP]']
PRED 2: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
PRED SCORE: -114.3066
SENT 3: ['[CLS]', 'however', ',', 'in', 'an', 'interview', ',', 'bloom', 'has', 'said', 'that', 'he', 'and', 'kerr', 'still', 'love', 'each', 'other', '.', '[SEP]']
PRED 3: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
PRED SCORE: -114.3066

I got the "must be equal to input_size" error message.

I got the "must be equal to input_size" error message.
Please tell me how to deal with this problem.
I used the OpenNMT-py data files.

macOS: 10.15.2
Python: 3.7.5rc1
torch: 1.0.0
torchvision: 0.2.1

$ time python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo

[2020-01-12 10:29:17,834 INFO] Extracting features...
[2020-01-12 10:29:17,834 INFO] * number of source features: 0.
[2020-01-12 10:29:17,834 INFO] * number of target features: 0.
[2020-01-12 10:29:17,834 INFO] Building Fields object...
[2020-01-12 10:29:17,835 INFO] Building & saving training data...
[2020-01-12 10:29:17,835 INFO] Reading source and target files: data/src-train.txt data/tgt-train.txt.
[2020-01-12 10:29:17,838 INFO] Building shard 0.
[2020-01-12 10:29:21,220 INFO] * saving 0th train data shard to data/demo.train.0.pt.
[2020-01-12 10:29:21,640 INFO] Building & saving validation data...
[2020-01-12 10:29:21,640 INFO] Reading source and target files: data/src-val.txt data/tgt-val.txt.
[2020-01-12 10:29:21,642 INFO] Building shard 0.
[2020-01-12 10:29:22,717 INFO] * saving 0th valid data shard to data/demo.valid.0.pt.
[2020-01-12 10:29:22,857 INFO] Building & saving vocabulary...
[2020-01-12 10:29:22,935 INFO] * reloading data/demo.train.0.pt.
[2020-01-12 10:29:23,156 INFO] * tgt vocab size: 33329.
[2020-01-12 10:29:23,183 INFO] * src vocab size: 14863.

real 0m14.890s
user 0m9.505s
sys 0m0.912s

$ time python train.py -data data/demo -save_model demo-model
[2020-01-12 10:30:12,973 INFO] * src vocab size = 14863
[2020-01-12 10:30:12,973 INFO] * tgt vocab size = 33329
[2020-01-12 10:30:12,973 INFO] Building model...
[2020-01-12 10:30:13,467 INFO] NMTModel(
  (encoder): RNNEncoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(14863, 500, padding_idx=1)
        )
      )
    )
    (rnn): LSTM(500, 500, num_layers=2, dropout=0.3)
  )
  (decoder): InputFeedRNNDecoder(
    (embeddings): Embeddings(
      (make_embedding): Sequential(
        (emb_luts): Elementwise(
          (0): Embedding(33329, 500, padding_idx=1)
        )
      )
    )
    (dropout): Dropout(p=0.3)
    (rnn): StackedLSTM(
      (dropout): Dropout(p=0.3)
      (layers): ModuleList(
        (0): LSTMCell(1000, 500)
        (1): LSTMCell(500, 500)
      )
    )
    (attn): GlobalAttention(
      (linear_in): Linear(in_features=500, out_features=500, bias=False)
      (linear_out): Linear(in_features=1000, out_features=500, bias=False)
    )
  )
  (generator): Sequential(
    (0): Linear(in_features=500, out_features=33329, bias=True)
    (1): Cast()
    (2): LogSoftmax()
  )
)
[2020-01-12 10:30:13,467 INFO] encoder: 11439500
[2020-01-12 10:30:13,467 INFO] decoder: 39120329
[2020-01-12 10:30:13,467 INFO] * number of parameters: 50559829
[2020-01-12 10:30:13,468 INFO] Starting training on CPU, could be very slow
[2020-01-12 10:30:13,468 INFO] Start training loop and validate every 10000 steps...
[2020-01-12 10:30:13,534 INFO] Loading dataset from data/demo.train.0.pt, number of examples: 9554
Traceback (most recent call last):
  File "train.py", line 109, in <module>
    main(opt)
  File "train.py", line 41, in main
    single_main(opt, -1)
  File "/Users/kami/Documents/mlearn/OpenNMT-py-with-BERT/onmt/train_single.py", line 116, in main
    valid_steps=opt.valid_steps)
  File "/Users/kami/Documents/mlearn/OpenNMT-py-with-BERT/onmt/trainer.py", line 209, in train
    report_stats)
  File "/Users/kami/Documents/mlearn/OpenNMT-py-with-BERT/onmt/trainer.py", line 318, in _gradient_accumulation
    outputs, attns = self.model(src, tgt, src_lengths, bptt=bptt)
  File "/Users/kami/python_env/OpenNMT-py-with-BERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/kami/Documents/mlearn/OpenNMT-py-with-BERT/onmt/models/model.py", line 42, in forward
    enc_state, memory_bank, lengths = self.encoder(src, lengths)
  File "/Users/kami/python_env/OpenNMT-py-with-BERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/kami/Documents/mlearn/OpenNMT-py-with-BERT/onmt/encoders/rnn_encoder.py", line 82, in forward
    memory_bank, encoder_final = self.rnn(packed_emb)
  File "/Users/kami/python_env/OpenNMT-py-with-BERT/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/kami/python_env/OpenNMT-py-with-BERT/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 175, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/Users/kami/python_env/OpenNMT-py-with-BERT/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 135, in check_forward_args
    self.input_size, input.size(-1)))
RuntimeError: input.size(-1) must be equal to input_size. Expected 500, got 768
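
For reference, this mismatch is reproducible in isolation: the bare train.py invocation above uses the default -rnn_size 500, while the fork feeds BERT's 768-dim hidden states into the encoder RNN (the Step 2 command earlier sidesteps this by passing -rnn_size 768 -word_vec_size 768). A minimal sketch in plain PyTorch, not the repo's code:

import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=500, hidden_size=500, num_layers=2)  # train.py defaults
bert_like = torch.randn(12, 2, 768)  # (seq_len, batch, dim); BERT-base emits 768-dim vectors
rnn(bert_like)
# RuntimeError: input.size(-1) must be equal to input_size. Expected 500, got 768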

Error training model

I'm attempting to train the model after preprocessing, using the same command that has worked for me on the default OpenNMT-py:

!python -u ./train.py -data preprocessed -save_model CQR_valid_100 -train_steps 10000 -rnn_size 128 -word_vec_size 128 -encoder_type rnn -decoder_type rnn -optim adagrad -learning_rate 0.15 -share_embeddings -valid_steps 100 -save_checkpoint_steps 1000 -log_file log.txt

but I keep getting this error:
Traceback (most recent call last):
  File "./train.py", line 109, in <module>
    main(opt)
  File "./train.py", line 41, in main
    single_main(opt, -1)
  File "/content/OpenNMT-py-with-BERT/onmt/train_single.py", line 116, in main
    valid_steps=opt.valid_steps)
  File "/content/OpenNMT-py-with-BERT/onmt/trainer.py", line 192, in train
    self._accum_batches(train_iter)):
  File "/content/OpenNMT-py-with-BERT/onmt/trainer.py", line 127, in _accum_batches
    for batch in iterator:
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/inputter.py", line 598, in __iter__
    for batch in self._iter_dataset(path):
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/inputter.py", line 583, in _iter_dataset
    for batch in cur_iter:
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/iterator.py", line 156, in __iter__
    yield Batch(minibatch, self.dataset, self.device)
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/batch.py", line 34, in __init__
    setattr(self, name, field.process(batch, device=device))
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/text_dataset.py", line 297, in process
    bert_embeddings = self.bertify(tensor)
  File "/content/OpenNMT-py-with-BERT/onmt/inputters/text_dataset.py", line 235, in bertify
    bert_embeddings = bert_model(tensor, output_all_encoded_layers = False)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_pretrained_bert/modeling.py", line 730, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_pretrained_bert/modeling.py", line 267, in forward
    words_embeddings = self.word_embeddings(input_ids)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py", line 117, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1506, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'

I can fix the error by forcing the device in text_dataset.py to always be 'cpu', but this would make training very slow, and it already takes hours to train even on the GPU.
Please help.
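
One hedged alternative to pinning everything to the CPU is to move the token-index tensor onto whatever device the BERT model already lives on before the lookup. A sketch of such a patch around the bertify call from the traceback (variable names follow the traceback; the surrounding code may differ):

# sketch for onmt/inputters/text_dataset.py, inside bertify() -- an untested assumption
device = next(bert_model.parameters()).device  # wherever the BERT weights were placed
tensor = tensor.to(device)                     # move the indices to the same device
bert_embeddings = bert_model(tensor, output_all_encoded_layers=False)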

Runtime Error in translate.py

I encountered this error while running translate.py.

[2020-02-20 09:55:40,623 INFO] Translating shard 0.
Traceback (most recent call last):
  File "translate.py", line 48, in <module>
    main(opt)
  File "translate.py", line 32, in main
    attn_debug=opt.attn_debug
  File "/content/drive/My Drive/Colab Notebooks/OpenNMT-py-with-BERT/onmt/translate/translator.py", line 325, in translate
    batch, data.src_vocabs, attn_debug
  File "/content/drive/My Drive/Colab Notebooks/OpenNMT-py-with-BERT/onmt/translate/translator.py", line 515, in translate_batch
    return_attention=attn_debug or self.replace_unk)
  File "/content/drive/My Drive/Colab Notebooks/OpenNMT-py-with-BERT/onmt/translate/translator.py", line 607, in _translate_batch
    src, enc_states, memory_bank, src_lengths = self._run_encoder(batch)
  File "/content/drive/My Drive/Colab Notebooks/OpenNMT-py-with-BERT/onmt/translate/translator.py", line 522, in _run_encoder
    src, src_lengths)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/Colab Notebooks/OpenNMT-py-with-BERT/onmt/encoders/rnn_encoder.py", line 82, in forward
    memory_bank, encoder_final = self.rnn(packed_emb)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 556, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 508, in check_forward_args
    self.check_input(input, batch_sizes)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 159, in check_input
    self.input_size, input.size(-1)))
RuntimeError: input.size(-1) must be equal to input_size. Expected 500, got 768

Model used: 2-layer BiLSTM with hidden size 500 trained for 20 epochs
src: WMT'14 English-German data - newstest2015.de
Code:

python translate.py \
-model iwslt-brnn2.s131_acc_62.71_ppl_7.74_e20.pt  \
-src data/src-test.txt \
-output pred.txt \
-replace_unk -verbose \
-gpu 0
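
A plausible cause, offered as a guess: this checkpoint was trained on vanilla OpenNMT-py with 500-dim word embeddings, while the fork replaces the source embeddings with 768-dim BERT vectors at translation time. OpenNMT-py checkpoints store their training options, so the expected sizes can be inspected directly (assuming the checkpoint unpickles in your environment):

import torch

ckpt = torch.load("iwslt-brnn2.s131_acc_62.71_ppl_7.74_e20.pt", map_location="cpu")
opt = ckpt["opt"]  # the argparse options the model was trained with
print(opt.rnn_size, opt.word_vec_size)  # expected to be 500, matching the error above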

AssertionError using a pretrained model at inference time when running translate.py

Hi there, I am trying to use this code with the pretrained OpenNMT English-German transformer model, using the following command:

python translate.py -model models/transformer-ende-wmt-pyOnmt/averaged-10-epoch.pt -src data/english.txt -output data/pred_eng.txt -replace_unk -verbose

However, I get the following error. Can anyone help with this issue?

    result = self.forward(*input, **kwargs)
  File "OpenNMT-py-with-BERT/onmt/modules/util_class.py", line 25, in forward
    assert len(self) == len(inputs_)
AssertionError

AssertionError when training

I have been running into the error "assert len(self) == len(inputs_)" when training. Please assist.

Error trace below:

File "train.py", line 109, in
main(opt)
File "train.py", line 39, in main
single_main(opt, 0)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/train_single.py", line 116, in main
valid_steps=opt.valid_steps)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/trainer.py", line 209, in train
report_stats)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/trainer.py", line 318, in gradient_accumulation
outputs, attns = self.model(src, tgt, src_lengths, bptt=bptt)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/models/model.py", line 42, in forward
enc_state, memory_bank, lengths = self.encoder(src, lengths)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/encoders/transformer.py", line 113, in forward
emb = self.embeddings(src)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/modules/embeddings.py", line 243, in forward
source = module(source)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/content/gdrive/My Drive/OpenNMT-py-with-BERT/onmt/modules/util_class.py", line 25, in forward
assert len(self) == len(inputs
)
AssertionError

How to add Bert embedding to OpenNMT seq2seq model

(OpenNMT/OpenNMT-py#2064)
(https://forum.opennmt.net/t/how-to-use-bert-embedding-into-opennmt-seq2seq-model/4430/4)

Originally I tried the seq2seq model (GloVe embeddings + RNN encoder-decoder + copy generator) on a Text2SQL task with OpenNMT, and everything worked perfectly fine. I get an accuracy of ~60% on the GeoQuery benchmark, the cross-entropy on the training set drops as low as 0.10, and token-level accuracy on the training set exceeds 90%.

When I add a BERT encoder and replace the GloVe embeddings with the last-layer output of BERT on the encoder side, the model seems to learn nothing during training. The token-level accuracy in training cannot reach 90%, and the cross-entropy remains around 0.3. During inference, the model predicts unreasonable SQL and can barely achieve 1% accuracy on the test set.

I have investigated this issue for quite a long time. I double-checked my optimizers: I use different optimizers for the two parts (Adam with a learning rate of 1e-3 for the parameters in my LSTM part, BertAdam with a learning rate of 1e-5 for the BERT part). For the encoding part, I directly copied code from a published GitHub repo.
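
For reference, a minimal sketch of such a two-optimizer split; the module names and step count below are hypothetical stand-ins, not the poster's actual code:

import torch.nn as nn
from torch.optim import Adam
from pytorch_pretrained_bert import BertAdam

# stand-in model: "bert" marks where the pretrained encoder would live
model = nn.ModuleDict({
    "bert": nn.Linear(768, 768),   # placeholder for the BERT encoder
    "decoder": nn.LSTM(768, 500),  # placeholder for the seq2seq part
})

bert_params = [p for n, p in model.named_parameters() if n.startswith("bert")]
rest_params = [p for n, p in model.named_parameters() if not n.startswith("bert")]

num_train_steps = 10000  # hypothetical
optim_rest = Adam(rest_params, lr=1e-3)
optim_bert = BertAdam(bert_params, lr=1e-5, warmup=0.1, t_total=num_train_steps)
# both optimizers then need .step() / .zero_grad() on every training step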

I could not come up with any other places that my code might go wrong. Any help will be much appreciated!

Here is the training information for the original LSTM seq2seq model

(screenshot)

Here is the training information for Bert + seq2seq model

(screenshot)

Here is the SQL prediction from the original seq2seq model. We can see variation in the lengths of the SQL predictions and in the values the model predicts in each query.
(screenshot)

Here is what BERT + seq2seq predicts. Not only does it fail to predict the long SQL queries (1-3) that the original seq2seq handles, it also predicts the same value over and over again (15-25) for different questions. This looks really weird to me. Any ideas?
(screenshot)
