raymondhs / constrained-levt
Lexically Constrained Neural Machine Translation with Levenshtein Transformer
License: MIT License
Hi, I would like to apply the method to my own dataset, but I cannot figure out how to specify constraints. In the README, the input newstest2014-wikt.en does not seem to contain any constraints. Thanks in advance.
Hello, I tried to follow the README to replicate your experiments, but when I run python interactive_with_constraints.py, it hangs. When I check the GPU state I can see the program is running on the GPU. I would expect this to be a fast process; how can I solve it?
My environment is Python 3.6, PyTorch 1.4.0, CUDA 10.1, and driver version 430.64.
Hoping for your answer.
Hi, I trained a Levenshtein Transformer NMT model for German-to-English following the fairseq instructions, and now I'm trying to use your code to generate translations with constraints, but I get errors. I saw you're using fairseq version 0.8.0, so I thought it might be a problem with incompatible versions, but I also tried training with versions 0.10.0 and 0.9.0 and still get errors. Version 0.8.0 has no translation_lev task at all, so that didn't work either. What am I missing?
This is the command I used for training:
fairseq-train data-bin/prepared_data \
--save-dir checkpoints \
--ddp-backend=legacy_ddp \
--task translation_lev \
--criterion nat_loss \
--arch levenshtein_transformer \
--noise random_delete \
--share-all-embeddings \
--optimizer adam --adam-betas '(0.9,0.98)' \
--lr 0.0002 --lr-scheduler reduce_lr_on_plateau \
--stop-min-lr '1e-09' --warmup-updates 10000 \
--warmup-init-lr '1e-07' --label-smoothing 0.1 \
--dropout 0.3 --weight-decay 0.01 \
--decoder-learned-pos \
--encoder-learned-pos \
--apply-bert-init \
--log-format 'simple' --log-interval 50 \
--log-file log \
--fixed-validation-seed 7 \
--max-tokens 2048 \
--save-interval-updates 4000 \
--max-update 300000 \
--patience 4 \
--skip-invalid-size-inputs-valid-test
This is the command I'm using for generation:
python interactive_with_constraints.py \
data-bin/prepared_data \
-s de -t en \
--input data/test_three.de \
--task translation_lev \
--path checkpoints/checkpoint_best.pt \
--iter-decode-max-iter 9 \
--iter-decode-eos-penalty 0 \
--beam 1 \
--print-step \
--batch-size 400 \
--buffer-size 4000 \
--preserve-constraint
These are the error tracebacks:
With version 0.10.2 (master):
Namespace(allow_insertion_constraint=False, beam=1, bpe=None, buffer_size=4000, cpu=False, criterion='cross_entropy', data='/content/drive/MyDrive/susanto_model/data-bin/prepared_data', dataset_impl=None, decoding_format=None, diverse_beam_groups=-1, diverse_beam_strength=0.5, empty_cache_freq=0, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', input='/content/drive/MyDrive/susanto_model/data/test_three.de', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=9, lazy_load=False, left_pad_source='True', left_pad_target='False', lenpen=1, load_alignments=False, log_format=None, log_interval=1000, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=400, max_source_positions=1024, max_target_positions=1024, max_tokens=None, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides='{}', momentum=0.99, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=False, no_repeat_ngram_size=0, noise='random_delete', num_shards=1, num_workers=1, optimizer='nag', path='/content/drive/MyDrive/susanto_model/checkpoints_susanto/checkpoint_best.pt', prefix_size=0, preserve_constraint=True, print_alignment=False, print_step=True, quiet=False, raw_text=False, remove_bpe=None, replace_unk=None, required_batch_size_multiple=8, results_path=None, sacrebleu=False, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang='de', target_lang='en', task='translation_lev', tbmf_wrapper=False, temperature=1.0, tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, unkpen=0, unnormalized=False, upsample_primary=1, user_dir=None, warmup_updates=0, weight_decay=0.0)
| [de] dictionary: 8544 types
| [en] dictionary: 8544 types
| loading model(s) from checkpoints/checkpoint_best.pt
Traceback (most recent call last):
File "interactive_with_constraints.py", line 234, in <module>
cli_main()
File "interactive_with_constraints.py", line 230, in cli_main
main(args)
File "interactive_with_constraints.py", line 101, in main
task=task,
File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 178, in load_model_ensemble_and_task
state = load_checkpoint_to_cpu(filename, arg_overrides)
File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 154, in load_checkpoint_to_cpu
state = _upgrade_state_dict(state)
File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 323, in _upgrade_state_dict
state['args'].task = 'translation'
AttributeError: 'NoneType' object has no attribute 'task'
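The AttributeError suggests the checkpoint's state['args'] is None. Checkpoints written by older fairseq releases carry an argparse Namespace under "args", while newer (0.10-era, Hydra-based) releases move the configuration under "cfg" and leave "args" empty, which is exactly what the 0.8-era checkpoint_utils bundled with constrained-levt trips over. A minimal sketch to classify which style a checkpoint uses (the exact version boundary is an assumption; check against your fairseq copy):

```python
def checkpoint_config_style(state):
    """Classify a fairseq checkpoint dict by where it keeps its config.

    Older fairseq stores an argparse Namespace under "args"; newer,
    Hydra-based releases store the config under "cfg" and leave "args"
    as None, which triggers the AttributeError in _upgrade_state_dict.
    Hedged sketch: version boundaries are assumptions, not guarantees.
    """
    if state.get("args") is not None:
        return "legacy-args"
    if state.get("cfg") is not None:
        return "hydra-cfg"
    return "unknown"

# To inspect a real checkpoint (path taken from the command above):
# import torch
# state = torch.load("checkpoints/checkpoint_best.pt", map_location="cpu")
# print(checkpoint_config_style(state))
```

If this reports "hydra-cfg", the checkpoint was saved by a fairseq too new for this repo's loader; retraining with the fairseq version vendored in constrained-levt is the safest path.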
With versions 0.10.0 and 0.9.0:
Traceback (most recent call last):
File "interactive_with_constraints.py", line 234, in <module>
cli_main()
File "interactive_with_constraints.py", line 230, in cli_main
main(args)
File "interactive_with_constraints.py", line 101, in main
task=task,
File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
File "/content/constrained-levt/fairseq/checkpoint_utils.py", line 186, in load_model_ensemble_and_task
model.load_state_dict(state['model'], strict=True)
File "/content/constrained-levt/fairseq/models/fairseq_model.py", line 69, in load_state_dict
return super().load_state_dict(state_dict, strict)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LevenshteinTransformerModel:
Missing key(s) in state_dict: "encoder.layers.0.self_attn.in_proj_weight", "encoder.layers.0.self_attn.in_proj_bias", "encoder.layers.1.self_attn.in_proj_weight", "encoder.layers.1.self_attn.in_proj_bias", "encoder.layers.2.self_attn.in_proj_weight", [...], "decoder.layers.5.encoder_attn.in_proj_bias".
Unexpected key(s) in state_dict: "encoder.layers.0.self_attn.k_proj.weight", "encoder.layers.0.self_attn.k_proj.bias", "encoder.layers.0.self_attn.v_proj.weight", "encoder.layers.0.self_attn.v_proj.bias", "encoder.layers.0.self_attn.q_proj.weight", "encoder.layers.0.self_attn.q_proj.bias", "encoder.layers.1.self_attn.k_proj.weight", "encoder.layers.1.self_attn.k_proj.bias", "encoder.layers.1.self_attn.v_proj.weight", "encoder.layers.1.self_attn.v_proj.bias", "encoder.layers.1.self_attn.q_proj.weight", "encoder.layers.1.self_attn.q_proj.bias", [...] "decoder.layers.5.encoder_attn.v_proj.bias", "decoder.layers.5.encoder_attn.q_proj.weight", "decoder.layers.5.encoder_attn.q_proj.bias".
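The missing/unexpected key pairs point at a known change in fairseq's MultiheadAttention: 0.8-era code stores a fused in_proj_weight / in_proj_bias, while later versions store separate q_proj / k_proj / v_proj tensors. A hedged workaround is to fuse the split tensors back before loading. The sketch below assumes the fused tensor stacks (q, k, v) along dim 0, which matches torch.nn.MultiheadAttention's convention; verify against your fairseq copy and spot-check translations before trusting the converted checkpoint.

```python
import torch

def fuse_qkv(state_dict):
    """Fold split q/k/v projection tensors back into the fused
    in_proj_weight / in_proj_bias layout the 0.8-era attention expects.

    Assumption: the fused tensor concatenates (q, k, v) along dim 0.
    """
    fused = {}
    for key, value in state_dict.items():
        if ".q_proj." in key:
            prefix, suffix = key.split(".q_proj.")  # suffix is "weight" or "bias"
            parts = [state_dict[f"{prefix}.{p}_proj.{suffix}"] for p in ("q", "k", "v")]
            fused[f"{prefix}.in_proj_{suffix}"] = torch.cat(parts, dim=0)
        elif ".k_proj." in key or ".v_proj." in key:
            continue  # consumed together with the matching q_proj entry
        else:
            fused[key] = value  # out_proj, layer norms, embeddings pass through
    return fused

# Typical use (hypothetical output path):
# state = torch.load("checkpoints/checkpoint_best.pt", map_location="cpu")
# state["model"] = fuse_qkv(state["model"])
# torch.save(state, "checkpoints/checkpoint_best_fused.pt")
```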
If I want to enable deletion and insertion, I just have to leave out the --preserve-constraint flag, right?
Hi, I tried to replicate your experiment and it worked, but I want to compare the BLEU score against the original data, so I would like to use the original training set to generate data with fairseq. Could you provide it? Thank you!