Comments (10)
It seems that you haven't changed the following two parameters.
"stopping_criterion":"valid_en-fr_mt_bleu,10",
"validation_metrics":"valid_en-fr_mt_bleu",
In your case, if your two languages are abbreviated lang1
and lang2
, and you want to use the BLEU
metric (accuracy, perplexity, and loss are also available) as the stopping criterion and validation metric, set the parameters as follows.
"stopping_criterion":"valid_lang1-lang2_mt_bleu,10",
"validation_metrics":"valid_lang1-lang2_mt_bleu",
I should mention that the framework supports multilingual translation, so lang1 and lang2 can be chosen from a larger set of languages.
from meta_xlm.
Is this okay?
from meta_xlm.
Yes, that is completely fine. I managed to train with an MLM + TLM objective. I wanted the BLEU score for evaluation as well, so I had to change my language initials to match the code.
I have one question on how to use the trained model for translation. translate.py is the one I should be using, right? Does it take tokenized, BPE-applied text as input?
Thanks so much for your help
from meta_xlm.
No. translate.py is for inference.
To train a machine translation model, always use train.py with the "mt_steps" objective (see /configs/mt_template.json).
For example, if you want to translate from English (en) to French (fr), then set "lgs": "en-fr" and "mt_steps": "en-fr".
Note that the system is multilingual, so it is bidirectional for a pair of languages: you can simultaneously train a model to translate from en to fr and from fr to en by specifying "lgs": "en-fr" and "mt_steps": "en-fr,fr-en".
You can go further and translate several languages simultaneously. Let's add German (de) and Italian (it) to our previous languages. Then you can set "lgs": "en-fr-de-it" and "mt_steps": "...". In this case "..." will be replaced by all possible ordered pairs of your languages: en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en, etc. (tedious to specify manually as the number of languages grows).
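Those ordered pairs are just permutations of the language list taken two at a time; here is a sketch (not part of the repository) of what that "..." expansion amounts to:

```python
from itertools import permutations

def expand_mt_steps(langs):
    """List every ordered source-target pair in XLM style (e.g. 'en-fr')."""
    return ",".join(f"{src}-{tgt}" for src, tgt in permutations(langs, 2))

print(expand_mt_steps(["en", "fr", "de", "it"]))
# en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en,de-fr,de-it,it-en,it-fr,it-de
```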
Note that the system is multi-tasking, so you can simultaneously do clm (causal language modeling), mlm (masked language modeling), tlm (translation language modeling), ae (denoising auto-encoding), bt (online back-translation) and mt (machine translation).
ae + bt = unsupervised mt
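For instance, the ae + bt combination above might be configured like this (a hypothetical fragment: "ae_steps" and "bt_steps" follow the XLM naming convention, and the values are only illustrative):

```python
# Hypothetical unsupervised-MT config fragment in XLM style:
# no parallel data is needed, only the two monolingual objectives.
unsupervised_mt = {
    "lgs": "en-fr",
    "ae_steps": "en,fr",              # denoising auto-encoding per language
    "bt_steps": "en-fr-en,fr-en-fr",  # online back-translation round trips
}
print(unsupervised_mt)
```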
If you need to understand how all this works, you can refer (if not already done) to these papers:
- (ae) Extracting and Composing Robust Features with Denoising Autoencoders: https://www.cs.toronto.edu/~larocheh/publications/icml-2008-denoising-autoencoders.pdf
- (bt) Improving Neural Machine Translation Models with Monolingual Data: https://arxiv.org/abs/1511.06709
- (ae, bt, mt: supervised and unsupervised MT) Phrase-Based & Neural Unsupervised Machine Translation: https://arxiv.org/abs/1804.07755
- (mlm) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding: https://arxiv.org/abs/1810.04805
- (clm) XLNet: Generalized Autoregressive Pretraining for Language Understanding: https://arxiv.org/abs/1906.08237
- (clm) GPT/GPT-2/GPT-3: Improving Language Understanding by Generative Pre-Training: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
- (clm) Language Models are Few-Shot Learners: https://arxiv.org/abs/2005.14165
- (mlm, tlm, clm, multilingual & cross-lingual MT, both supervised and unsupervised ...) Cross-lingual Language Model Pretraining: https://arxiv.org/abs/1901.07291
- (meta-learning) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks: https://arxiv.org/abs/1703.03400
- (our paper: all of the above + meta-learning) On the use of linguistic similarities to improve Neural Machine Translation for African Languages: https://openreview.net/forum?id=Q5ZxoD2LqcI (the updated version will be on arXiv soon)
For another project I'm working on, I integrated a new architecture into the code, TIM (Transformers with Competitive Ensembles of Independent Mechanisms: https://arxiv.org/abs/2103.00336), which can be used in place of the standard transformer.
Also, I integrated code to automatically fine-tune models on text classification tasks (GLUE, XNLI, custom tasks, ...).
All these updates are here; I will make everything public along with another paper.
from meta_xlm.
I'm trying to reproduce all this with the Hugging Face Transformers library: https://github.com/Tikquuss/lm
from meta_xlm.
Thanks a lot for your help. That is quite thorough.
I already used lm_template.json to train a language model with the parameter "mlm_steps": "...". As per your GitHub, this by default uses my monolingual and parallel datasets (de, en, de-en). Then I trained with mt_template.json and the parameter "mt_steps": "...". I believe that now I have an MT model for my languages, right?
Now, if I want to run inference on new test sets, do I use translate.py? Could you give a hint on how to use it?
Thank you so much again and I will definitely check your new project and best of luck with your future paper.
from meta_xlm.
I am trying to use translate.py for inference, but I get the following error:
Traceback (most recent call last):
File "translate.py", line 141, in
main(params)
File "translate.py", line 60, in main
logger = initialize_exp(params)
File "/user/HS301/m16265/Documents/XML-R/meta_XLM/XLM/src/utils.py", line 57, in initialize_exp
device = params.device
AttributeError: 'Namespace' object has no attribute 'device'
I am not sure what I am doing wrong. My command on the command line is as follows:
cat /user/HS301/m16265/Documents/XML-R/processed/test.en | python translate.py --exp_name mt_enfrde --model_path /user/HS301/m16265/Documents/XML-R/dump_path/mt_enfrde/demo/best-valid_de-en_mt_bleu.pth --src_lang en --tgt_lang de --output_path output
Could you help with this error or suggest how to solve it? Thank you
from meta_xlm.
Use translate_our.py instead (see https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L115 for how to use)
from meta_xlm.
Thank you for your help, but I get this error with translate_our.py:
Traceback (most recent call last):
File "translate_our.py", line 175, in
main(params)
File "translate_our.py", line 149, in main
logger = initialize_exp(params)
File "/user/HS301/m16265/Documents/XML-R_server/meta_XLM/XLM/src/utils.py", line 57, in initialize_exp
device = params.device
AttributeError: 'Namespace' object has no attribute 'device'
I noticed that you have a device variable in translate_our.py:
https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L40
Should I be doing something before running translate_our.py? Thank you for your help.
from meta_xlm.
Go to line 34 of translate_our.py (before logger = initialize_exp(params)) and set up the device; for example, you can add this line of code:
params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
If other parameters are missing, just do the same for them. All parameters are well described in train.py (I encourage you to understand the code well so you can make some adjustments yourself).
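A self-contained sketch of that patch (the Namespace here stands in for the params object produced by the script's argument parser, and the CPU fallback when PyTorch is absent is my addition, not repository code):

```python
import argparse

# Stand-in for the params produced by the script's argument parser.
params = argparse.Namespace()

try:
    import torch
    # The one-line fix: pick the GPU when available, else the CPU.
    params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
except ImportError:
    params.device = 'cpu'  # fallback so the sketch runs without PyTorch

# initialize_exp(params) can now read params.device without raising
# AttributeError.
print(params.device)
```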
from meta_xlm.