Comments (10)

Tikquuss commented on June 8, 2024

It seems that you haven't changed the following two parameters.

"stopping_criterion":"valid_en-fr_mt_bleu,10", 
"validation_metrics":"valid_en-fr_mt_bleu",

In your case, if your two languages are abbreviated lang1 and lang2 and you want to use the BLEU metric (accuracy, perplexity and loss are also available) both as the stopping criterion and to validate your models, set the two parameters like this:

"stopping_criterion":"valid_lang1-lang2_mt_bleu,10", 
"validation_metrics":"valid_lang1-lang2_mt_bleu",

I should mention that the framework supports multilingual translation, so lang1 and lang2 can be chosen from a larger set of languages.
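
The metric names above follow a fixed pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not code from the repository: the helper names `mt_bleu_metric` and `stopping_config` are hypothetical, and treating the number after the comma as a patience value is an assumption about how the framework reads it.

```python
def mt_bleu_metric(src_lang: str, tgt_lang: str, split: str = "valid") -> str:
    """Build a metric name in the pattern shown above, e.g. valid_en-fr_mt_bleu."""
    return f"{split}_{src_lang}-{tgt_lang}_mt_bleu"


def stopping_config(src_lang: str, tgt_lang: str, patience: int = 10) -> dict:
    """Return the two config entries for one language pair.

    `patience` is the number written after the comma in the examples above;
    its exact meaning in the framework is assumed here, not confirmed.
    """
    metric = mt_bleu_metric(src_lang, tgt_lang)
    return {
        "stopping_criterion": f"{metric},{patience}",
        "validation_metrics": metric,
    }


# Example for a German-English pair (hypothetical choice of languages).
config = stopping_config("de", "en")
print(config)
```

The resulting dictionary can be copied into the JSON config by hand, or merged into it with the `json` module.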

from meta_xlm.

Tikquuss commented on June 8, 2024

Is this okay?

sadanyh commented on June 8, 2024

Yes, that is completely fine. I managed to train with an MLM + TLM objective. I wanted the BLEU score for evaluation as well, so I had to change my language initials to match the code.

I have one question about how to use the trained model for translation. translate.py is the one I should be using, right? Does it take tokenized, BPE-encoded text as input?

Thanks so much for your help.

Tikquuss commented on June 8, 2024

No.
translate.py is for inference.

To train a machine translation model, always use train.py and specify the "mt_steps" objective (see /configs/mt_template.json).

For example, to translate from English (en) to French (fr), set "lgs": "en-fr" and "mt_steps": "en-fr".
Note that the system is multilingual, so it can handle both directions of a language pair. You can train a single model to translate from en to fr and from fr to en at the same time by specifying "lgs": "en-fr" and "mt_steps": "en-fr,fr-en".
You can go further and translate between several languages simultaneously. Let's add German (de) and Italian (it) to the previous languages. Then you can set "lgs": "en-fr-de-it" and "mt_steps":"...". In this case the "..." in mt_steps is replaced by all possible combinations of your languages: en-fr,en-de,en-it,fr-en,fr-de,fr-it,de-en, etc. (writing them out by hand becomes tedious as the number of languages grows).
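
To make the "..." shorthand concrete, here is a small sketch of the list it stands for according to the description above: every ordered pair of distinct languages in "lgs". The function `expand_mt_steps` is a hypothetical illustration and is not taken from the framework, whose own expansion logic may differ in detail.

```python
from itertools import permutations


def expand_mt_steps(lgs: str) -> str:
    """List every ordered pair of distinct languages in an "lgs" string.

    Illustrative only: mirrors the expansion described in the comment above,
    not the framework's actual implementation.
    """
    langs = lgs.split("-")
    # permutations(langs, 2) yields ordered pairs, so both en-fr and fr-en appear.
    return ",".join(f"{src}-{tgt}" for src, tgt in permutations(langs, 2))


steps = expand_mt_steps("en-fr-de-it")
print(steps)
print(len(steps.split(",")), "translation directions")
```

With n languages there are n * (n - 1) directions, which is why the shorthand is convenient once more than two or three languages are involved.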

Note that the system is multi-task, so you can simultaneously train with clm (causal language modeling), mlm (masked language modeling), tlm (translation language modeling), ae (denoising auto-encoding), bt (online back-translation) and mt (machine translation).

ae + bt = unsupervised mt

If you need to understand how all this works you can refer (if not already done) to these papers:

For another project I'm working on, I integrated a new architecture in the code, TIM (transformers with competitive ensembles of independent mechanisms: https://arxiv.org/abs/2103.00336), which can be used in place of the normal transformer.
I also integrated code to automatically fine-tune models on text classification tasks (GLUE, XNLI, custom tasks ...).
All these updates are here, I will make everything public with another paper.

Tikquuss commented on June 8, 2024

I'm trying to reproduce all this with the Hugging Face transformers library: https://github.com/Tikquuss/lm

sadanyh commented on June 8, 2024

Thanks a lot for your help. That is quite thorough.

I already used lm_template.json to train a language model with the parameter "mlm_steps":"...". As per your GitHub README, this uses my monolingual and parallel datasets (de, en, de-en) by default. I then took this language model and trained it using mt_template.json with the parameter "mt_steps":"...". I believe I now have an MT model for my languages, right?

Now, if I want to use it on new test sets for inference, do I use translate.py? Could you give me a hint on how to use it?

Thank you so much again. I will definitely check out your new project, and best of luck with your future paper.

sadanyh commented on June 8, 2024

I am trying to use translate.py for inference. I get the following error:

Traceback (most recent call last):
  File "translate.py", line 141, in <module>
    main(params)
  File "translate.py", line 60, in main
    logger = initialize_exp(params)
  File "/user/HS301/m16265/Documents/XML-R/meta_XLM/XLM/src/utils.py", line 57, in initialize_exp
    device = params.device
AttributeError: 'Namespace' object has no attribute 'device'

I am not sure what I am doing wrong. My command on the command line is as follows:

cat /user/HS301/m16265/Documents/XML-R/processed/test.en | python translate.py --exp_name mt_enfrde --model_path /user/HS301/m16265/Documents/XML-R/dump_path/mt_enfrde/demo/best-valid_de-en_mt_bleu.pth --src_lang en --tgt_lang de --output_path output

Could you help with this error? Any suggestions for solving it would be appreciated. Thank you.

Tikquuss commented on June 8, 2024

Use translate_our.py instead (see https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L115 for how to use)

sadanyh commented on June 8, 2024

Thank you for your help, but I get this error with translate_our.py:

Traceback (most recent call last):
  File "translate_our.py", line 175, in <module>
    main(params)
  File "translate_our.py", line 149, in main
    logger = initialize_exp(params)
  File "/user/HS301/m16265/Documents/XML-R_server/meta_XLM/XLM/src/utils.py", line 57, in initialize_exp
    device = params.device
AttributeError: 'Namespace' object has no attribute 'device'

I noticed that you have a device variable in translate_our.py:
https://github.com/Tikquuss/meta_XLM/blob/master/XLM/translate_our.py#L40

Should I be doing something before running translate_our.py? Thank you for your help.

from meta_xlm.

Tikquuss commented on June 8, 2024

Go to line 34 of translate_our.py (before logger = initialize_exp(params)) and set up the device. For example, you can add this line of code:

params.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

If other parameters are missing, set them in the same way. All parameters are described in train.py (I encourage you to study the code closely so that you can make such adjustments yourself).
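
The same fix can be applied to several missing attributes at once. Below is a minimal sketch, assuming `params` is an `argparse.Namespace` as the traceback indicates. The helper `fill_missing` and the default values are hypothetical, and the string "cpu" only stands in for the `torch.device(...)` expression shown above so that the snippet runs without PyTorch installed.

```python
from argparse import Namespace


def fill_missing(params: Namespace, defaults: dict) -> list:
    """Set each default only if `params` lacks that attribute.

    Returns the names that were added, so you can see which parameters
    the script did not define itself. Hypothetical helper, not part of the repo.
    """
    added = []
    for name, value in defaults.items():
        if not hasattr(params, name):
            setattr(params, name, value)
            added.append(name)
    return added


# Stand-in for the parsed command-line arguments of the translation script.
params = Namespace(exp_name="mt_enfrde", src_lang="en", tgt_lang="de")

# In the real script, use the torch.device(...) expression from the comment above
# instead of the plain string "cpu".
added = fill_missing(params, {"device": "cpu", "exp_name": "unused_default"})
print(added, params.device, params.exp_name)
```

Existing values are left untouched, so calling the helper just before `initialize_exp(params)` would not override anything passed on the command line.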
