Comments (10)
It seems that you haven't changed the following two parameters.
"stopping_criterion":"valid_en-fr_mt_bleu,10",
"validation_metrics":"valid_en-fr_mt_bleu",
In your case, if your two languages are abbreviated lang1
and lang2
, and you want to use the BLEU
metric (accuracy, perplexity, and loss are also available) as the stopping criterion and validation metric, set the parameters as follows.
"stopping_criterion":"valid_lang1-lang2_mt_bleu,10",
"validation_metrics":"valid_lang1-lang2_mt_bleu",
I should mention that the framework supports multilingual translation, so lang1 and lang2 can be chosen from a larger set of languages.
from meta_xlm.
Is this okay?
from meta_xlm.
Yes, that is completely fine. I managed to train with an MLM + TLM objective. I wanted the BLEU score for evaluation as well, so I had to change my language initials to match the code.
I have one question on how to use the trained model for translation. translate.py is the one I should be using, right? Does it take tokenized, BPE-applied text as input?
Thanks so much for your help
from meta_xlm.
No. translate.py is for inference.
To train a machine translation model, always use train.py with the "mt_steps" objective (see /configs/mt_template.json).
For example, if you want to translate from English (en) to French (fr), then set "lgs": "en-fr" and "mt_steps": "en-fr".
Note that the system is multilingual, so it is bidirectional for a pair of languages: you can simultaneously train a model to translate from en to fr and from fr to en by specifying "lgs": "en-fr" and "mt_steps": "en-fr,fr-en".
You can go further and translate several languages simultaneously. Let's add German (de) and Italian (it) to our previous languages. Then you can set "lgs": "en-fr-de-it" and "mt_steps": "...". In this case "..." will be replaced by all possible ordered pairs of your languages: en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en, etc. (tedious to specify manually as the number of languages grows).
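Those ordered pairs are just permutations of the language list taken two at a time; here is a sketch (not part of the repository) of what that "..." expansion amounts to:

```python
from itertools import permutations

def expand_mt_steps(langs):
    """List every ordered source-target pair in XLM style (e.g. 'en-fr')."""
    return ",".join(f"{src}-{tgt}" for src, tgt in permutations(langs, 2))

print(expand_mt_steps(["en", "fr", "de", "it"]))
# en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en,de-fr,de-it,it-en,it-fr,it-de
```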
Note that the system is multi-tasking, so you can simultaneously do clm (causal language modeling), mlm (masked language modeling), tlm (translation language modeling), ae (denoising auto-encoding), bt (online back-translation) and mt (machine translation).
ae + bt = unsupervised mt
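For instance, the ae + bt combination above might be configured like this (a hypothetical fragment: "ae_steps" and "bt_steps" follow the XLM naming convention, and the values are only illustrative):

```python
# Hypothetical unsupervised-MT config fragment in XLM style:
# no parallel data is needed, only the two monolingual objectives.
unsupervised_mt = {
    "lgs": "en-fr",
    "ae_steps": "en,fr",              # denoising auto-encoding per language
    "bt_steps": "en-fr-en,fr-en-fr",  # online back-translation round trips
}
print(unsupervised_mt)
```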
If you need to understand how all this works, you can refer (if not already done) to these papers:
- (ae) Extracting and Composing Robust Features with Denoising Autoencoders: https://www.cs.toronto.edu/~larocheh/publications/icml-2008-denoising-autoencoders.pdf
- (bt) Improving Neural Machine Translation Models with Monolingual Data: https://arxiv.org/abs/1511.06709
- (ae, bt, mt: supervised and unsupervised MT) Phrase-Based & Neural Unsupervised Machine Translation: https://arxiv.org/abs/1804.07755
- (mlm) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding: https://arxiv.org/abs/1810.04805
- (clm) XLNet: Generalized Autoregressive Pretraining for Language Understanding: https://arxiv.org/abs/1906.08237
- (clm) GPT/GPT-2/GPT-3: Improving Language Understanding by Generative Pre-Training: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
- (clm) Language Models are Few-Shot Learners: https://arxiv.org/abs/2005.14165
- (mlm, tlm, clm, multilingual & cross-lingual MT, both supervised and unsupervised ...) Cross-lingual Language Model Pretraining: https://arxiv.org/abs/1901.07291
- (meta-learning) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks: https://arxiv.org/abs/1703.03400
- (our paper: all of the above + meta-learning) On the use of linguistic similarities to improve Neural Machine Translation for African Languages: https://openreview.net/forum?id=Q5ZxoD2LqcI (the updated version will be on arXiv soon)
For another project I'm working on, I integrated a new architecture into the code, TIM (Transformers with Competitive Ensembles of Independent Mechanisms: https://arxiv.org/abs/2103.00336), which can be used in place of the standard transformer.
Also, I integrated code to automatically fine-tune models on text classification tasks (GLUE, XNLI, custom tasks, ...).
All these updates are here; I will make everything public along with another paper.
from meta_xlm.
I'm trying to reproduce all this with the Hugging Face Transformers library: https://github.com/Tikquuss/lm
from meta_xlm.
Thanks a lot for your help. That is quite thorough.
I already used lm_template.json to train a language model with the parameter "mlm_steps": "...". As per your GitHub, this by default uses my monolingual and parallel datasets (de, en, de-en). Then I trained with mt_template.json and the parameter "mt_steps": "...". I believe that now I have an MT model for my languages, right?
Now, if I want to run inference on new test sets, do I use translate.py? Could you give a hint on how to use it?
Thank you so much again and I will definitely check your new project and best of luck with your future paper.
from meta_xlm.
I am trying to use translate.py for inference, but I get the following error:
Traceback (most recent call last):
File "translate.py", line 141, in
main(params)
File "translate.py", line 60, in main
logger = initialize_exp(params)
File "/user/HS301/m16265/Documents/XML-R/meta_XLM/XLM/src/utils.py", line 57, in initialize_exp
device = params.device
AttributeError: 'Namespace' object has no attribute 'device'
I am not sure what I am doing wrong. My command on the command line is as follows:
cat /user/HS301/m16265/Documents/XML-R/processed/test.en | python translate.py --exp_name mt_enfrde --model_path /user/HS301/m16265/Documents/XML-R/dump_path/mt_enfrde/demo/best-valid_de-en_mt_bleu.pth --src_lang en --tgt_lang de --output_path output
Could you help with this error or suggest how to solve it? Thank you
from meta_xlm.
Use translate_our.py instead (see https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L115 for how to use)
from meta_xlm.
Thank you for your help, but I get this error with translate_our.py:
Traceback (most recent call last):
File "translate_our.py", line 175, in
main(params)
File "translate_our.py", line 149, in main
logger = initialize_exp(params)
File "/user/HS301/m16265/Documents/XML-R_server/meta_XLM/XLM/src/utils.py", line 57, in initialize_exp
device = params.device
AttributeError: 'Namespace' object has no attribute 'device'
I noticed that you have a device variable in translate_our.py:
https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L40
Should I be doing something before running translate_our.py? Thank you for your help.
from meta_xlm.
Go to line 34 of translate_our.py (before logger = initialize_exp(params)) and set up the device; for example, you can add this line of code:
params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
If other parameters are missing, just do the same for them. All parameters are well described in train.py (I encourage you to understand the code well so you can make some adjustments yourself).
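A self-contained sketch of that patch (the Namespace here stands in for the params object produced by the script's argument parser, and the CPU fallback when PyTorch is absent is my addition, not repository code):

```python
import argparse

# Stand-in for the params produced by the script's argument parser.
params = argparse.Namespace()

try:
    import torch
    # The one-line fix: pick the GPU when available, else the CPU.
    params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
except ImportError:
    params.device = 'cpu'  # fallback so the sketch runs without PyTorch

# initialize_exp(params) can now read params.device without raising
# AttributeError.
print(params.device)
```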
from meta_xlm.