elbayadm / attn2d

Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

License: MIT License

Python 96.96% Shell 0.05% C++ 0.84% Cuda 1.93% Lua 0.22%

fairseq neural-machine-translation nlp nmt pytorch

attn2d's Issues

AttributeError: 'float' object has no attribute 'data' on pytorch 0.4.1

I get the following error at trainer.backward_step() when running the demo script:
File "attn2d/nmt/trainer.py", line 230, in backward_step
self.clip_norm).data.item()
AttributeError: 'float' object has no attribute 'data'

I fixed it by replacing that call with:
grad_norm = torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_norm)

With this change the script runs fine on torch==0.4.1.

Which version of PyTorch was the code written for?
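
The float-versus-tensor difference can be absorbed with a small conversion helper. The sketch below is an illustration, not code from the repository: to_float and FakeScalar are hypothetical names. It assumes only that the value is either a plain Python number (which is what the traceback above shows on torch 0.4.1) or an object exposing .item(), as zero-dimensional tensors do.

```python
def to_float(value):
    """Return a plain Python float from a number or a tensor-like scalar."""
    # Tensor-like scalars expose .item(); plain Python floats and ints do not.
    if hasattr(value, "item"):
        return float(value.item())
    return float(value)


class FakeScalar:
    """Stand-in for a zero-dimensional tensor, used only for this demo."""

    def __init__(self, v):
        self._v = v

    def item(self):
        return self._v


print(to_float(2.5))              # plain float passes through
print(to_float(FakeScalar(2.5)))  # tensor-like value is unwrapped
```

In the trainer this would wrap the existing call, for example grad_norm = to_float(torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_norm)), so the same line works whichever type is returned.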

ModuleNotFoundError: No module named 'nmt.loader.pair_loader'

Hi, as the title says, I got this error:

Traceback (most recent call last):
File "/home//repos/attn2d/train.py", line 84, in
train(params)
File "/home//repos/attn2d/train.py", line 20, in train
from nmt.trainer import Trainer
File "/home//repos/attn2d/nmt/trainer.py", line 17, in
from nmt.loader.pair_loader import DataPair
ModuleNotFoundError: No module named 'nmt.loader.pair_loader'

No module named 'nmt.loader.pair_loader'

As the title says, the module pair_loader is missing. Please double-check.
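
A quick way to confirm which modules are actually present in a checkout is to ask the import system directly. This is a minimal sketch using the standard library; module_available is a hypothetical helper name, and the demo uses stdlib module names so that it runs anywhere.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if dotted_name can be located by the import system."""
    try:
        # find_spec imports parent packages but not the leaf module itself.
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in the dotted path is itself missing.
        return False


print(module_available("json.decoder"))
print(module_available("json.no_such_submodule"))
```

Run from the repository root, module_available("nmt.loader.pair_loader") would show whether the file is missing from the checkout or merely not on the import path.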

ModuleNotFoundError: No module named 'examples.simultaneous'

🐛 Bug

Hi,

I was trying to evaluate the pre-trained models listed under "Efficient Wait-k Models for Simultaneous Machine Translation", following the instructions in the README. Specifically, I did the following.

After downloading the model and data and placing them under pre_saved:

cd ~/attn2d/pre_saved
tar xzf iwslt14_de_en.tar.gz
tar xzf tf_waitk_model.tar.gz

k=5 # Evaluation time k
output=wait$k.log
CUDA_VISIBLE_DEVICES=0 python generate.py pre_saved/iwslt14_deen_bpe10k_binaries/ -s de -t en --gen-subset test --path pre_saved/tf_waitk_model.tar.gz --task waitk_translation --eval-waitk $k --model-overrides "{'max_source_positions': 1024, 'max_target_positions': 1024}" --left-pad-source False --user-dir examples/waitk --no-progress-bar --max-tokens 8000 --remove-bpe --beam 1 2>&1 | tee -a $output

This produces the following error message:

Traceback (most recent call last):
  File "generate.py", line 11, in <module>
    cli_main()
  File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
    parser = options.get_generation_parser()
  File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
    parser = get_parser("Generation", default_task)
  File "/home/attn2d/fairseq/options.py", line 197, in get_parser
    utils.import_user_module(usr_args)
  File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
    importlib.import_module(module_name)
  File "/home/anaconda3/envs/py37/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
    from . import models, tasks
  File "/home/attn2d/examples/waitk/models/__init__.py", line 7, in <module>
    importlib.import_module('examples.simultaneous.models.' + model_name)
  File "/home/anaconda3/envs/py37/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'examples.simultaneous'

EDIT

Okay, here is more detail.

I believe this line is responsible for the error message shared above.

I changed importlib.import_module('examples.simultaneous.models.' + model_name) to importlib.import_module('examples.waitk.models.' + model_name).
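
One way to avoid a stale hard-coded prefix like this is to build the dotted path from the package's own name. The sketch below is an assumption-laden illustration, not the repository's code: import_sibling_modules and the throwaway package demo_pkg are hypothetical, and whether this suits attn2d depends on how the package is imported, which I have not verified.

```python
import importlib
import os
import sys
import tempfile


def import_sibling_modules(package_name, package_dir):
    """Import every public .py file in package_dir as package_name.<file>."""
    imported = []
    for file_name in sorted(os.listdir(package_dir)):
        if file_name.endswith(".py") and not file_name.startswith("_"):
            module_name = file_name[: -len(".py")]
            # The prefix comes from the caller, so renaming the directory
            # does not leave an outdated package path behind.
            imported.append(importlib.import_module(package_name + "." + module_name))
    return imported


# Demo with a throwaway package created in a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "model_a.py"), "w") as handle:
    handle.write("NAME = 'a'\n")
sys.path.insert(0, root)
importlib.invalidate_caches()
modules = import_sibling_modules("demo_pkg", pkg_dir)
print([m.__name__ for m in modules])
```

Inside a package's __init__.py the call would be import_sibling_modules(__name__, os.path.dirname(__file__)).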

Then, I got another error:

  File "generate.py", line 11, in <module>
    cli_main()
  File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
    parser = options.get_generation_parser()
  File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
    parser = get_parser("Generation", default_task)
  File "/home/attn2d/fairseq/options.py", line 197, in get_parser
    utils.import_user_module(usr_args)
  File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
    importlib.import_module(module_name)
  File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
    from . import models, tasks
  File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
    importlib.import_module('examples.waitk.models.' + model_name)
  File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
    from . import models, tasks
  File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
    importlib.import_module('examples.waitk.models.' + model_name)
  File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/attn2d/examples/waitk/models/waitk_transformer.py", line 24, in <module>
    from examples.simultaneous.modules import TransformerEncoderLayer, TransformerDecoderLayer

So I changed that import as well, to:
from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer

When I tried once more, I got the following error:

  File "generate.py", line 11, in <module>
    cli_main()
  File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
    parser = options.get_generation_parser()
  File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
    parser = get_parser("Generation", default_task)
  File "/home/attn2d/fairseq/options.py", line 197, in get_parser
    utils.import_user_module(usr_args)
  File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
    importlib.import_module(module_name)
  File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
    from . import models, tasks
  File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
    importlib.import_module('examples.waitk.models.' + model_name)
  File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
    from . import models, tasks
  File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
    importlib.import_module('examples.waitk.models.' + model_name)
  File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/attn2d/examples/waitk/models/waitk_transformer.py", line 25, in <module>
    from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer
  File "/home/attn2d/examples/waitk/modules/__init__.py", line 2, in <module>
    from .controller import Controller

To get past this, I commented out the following lines in examples/waitk/modules/__init__.py:

from .controller import Controller
from .branch_controller import BranchController
from .oracle import SimulTransOracleDP, SimulTransOracleDP1

Next, I tried the generation command given in the README once more:

CUDA_VISIBLE_DEVICES=0 python generate.py pretrained-sources/iwslt14_deen_bpe10k_binaries/ -s de -t en --gen-subset test --path pretrained-sources/model.pt --task waitk_translation --eval-waitk $k --model-overrides "{'max_source_positions': 1024, 'max_target_positions': 1024}" --left-pad-source False --user-dir examples/waitk --no-progress-bar --max-tokens 8000 --remove-bpe --beam 1 2>&1 | tee -a $output

I got this error:


2021-09-20 20:29:46 | INFO | fairseq_cli.generate | Namespace(all_gather_list_size=16384, beam=1, bpe=None, checkpoint_suffix='', cpu=False, criterion='cross_entropy', data='pretrained-sources/iwslt14_deen_bpe10k_binaries/', data_buffer_size=0, dataset_impl=None, decoding_format=None, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eval_bleu=False, eval_bleu_args=None, eval_bleu_detok='space', eval_bleu_detok_args=None, eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, eval_waitk=5, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, left_pad_source='False', left_pad_target='False', lenpen=1, load_alignments=False, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=None, max_source_positions=1024, max_target_positions=1024, max_tokens=8000, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides="{'max_source_positions': 1024, 'max_target_positions': 1024}", model_parallel_size=1, momentum=0.99, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=True, no_repeat_ngram_size=0, num_shards=1, num_workers=1, optimizer='nag', path='pretrained-sources/model.pt', prefix_size=0, print_alignment=False, print_step=False, quantization_config_path=None, quiet=False, remove_bpe='@@ ', replace_unk=None, required_batch_size_multiple=8, results_path=None, retain_iter_history=False, sacrebleu=False, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang='de', target_lang='en', task='waitk_translation', temperature=1.0, 
tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, truncate_source=False, unkpen=0, unnormalized=False, upsample_primary=1, user_dir='examples/waitk', warmup_updates=0, weight_decay=0.0)
2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | [de] dictionary: 8848 types
2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | [en] dictionary: 6632 types
2021-09-20 20:29:46 | INFO | fairseq.data.data_utils | loaded 6750 examples from: pretrained-sources/iwslt14_deen_bpe10k_binaries/test.de-en.de
2021-09-20 20:29:46 | INFO | fairseq.data.data_utils | loaded 6750 examples from: pretrained-sources/iwslt14_deen_bpe10k_binaries/test.de-en.en
2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | pretrained-sources/iwslt14_deen_bpe10k_binaries/ test de-en 6750 examples
2021-09-20 20:29:46 | INFO | fairseq_cli.generate | loading model(s) from pretrained-sources/model.pt
Traceback (most recent call last):
  File "generate.py", line 11, in <module>
    cli_main()
  File "/home/attn2d/fairseq_cli/generate.py", line 278, in cli_main
    main(args)
  File "/home/attn2d/fairseq_cli/generate.py", line 36, in main
    return _main(args, sys.stdout)
  File "/home/attn2d/fairseq_cli/generate.py", line 103, in _main
    num_workers=args.num_workers,
  File "/home/attn2d/fairseq/tasks/fairseq_task.py", line 181, in get_batch_iterator
    required_batch_size_multiple=required_batch_size_multiple,
  File "/home/attn2d/fairseq/data/data_utils.py", line 220, in batch_by_size
    from fairseq.data.data_utils_fast import batch_by_size_fast
  File "fairseq/data/data_utils_fast.pyx", line 1, in init fairseq.data.data_utils_fast
    # cython: language_level=3
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
##########

I gave up after that. @elbayadm, I hope you can help me with this.
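
For anyone hitting the same ValueError: this message typically means a compiled extension (here fairseq/data/data_utils_fast) was built against a different NumPy than the one imported at run time. The reported sizes, 88 expected and 80 found, are commonly seen when the extension was built against NumPy 1.20 or newer while an older NumPy is installed. Treat that threshold as my assumption, not something stated in this thread. A minimal check, with the hypothetical helper names parse_major_minor and older_than:

```python
import re


def parse_major_minor(version_text):
    """Return (major, minor) as ints, ignoring suffixes such as 'rc1'."""
    numbers = []
    for piece in version_text.split(".")[:2]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def older_than(runtime_version, threshold="1.20"):
    """True if runtime_version is strictly older than threshold."""
    return parse_major_minor(runtime_version) < parse_major_minor(threshold)


# With NumPy installed, the real check would be older_than(numpy.__version__).
print(older_than("1.19.5"))
print(older_than("1.21.2"))
```

If the check reports an older NumPy, the usual remedies are to upgrade NumPy or to rebuild the extension in the same environment that will run it.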

Environment

I followed the instructions in the README to set up my environment:

git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .

As a result, I have the following libraries in my environment:

Python 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.9.0+cu102'
>>> import fairseq
>>> fairseq.__version__
'0.9.0'
>>> 
$ python --version
Python 3.7.10

Operating system: Linux

The training speed of pervasive attention model

Hello,

I am trying to run the pervasive attention model recipes as recommended in the README:
https://github.com/elbayadm/attn2d/blob/master/examples/pervasive/README.md

In my runs the model is very slow: roughly 30 words per second on a single 1080 GPU. Could you please give me a rough idea of what speed to expect, based on your experiments?

Thanks for your insights
Parnia

16 undefined names

Replace ‘false’ with ‘False’
Missing import tensorflow as tf
Missing import numpy as np
See #2

flake8 testing of https://github.com/elbayadm/attn2d on Python 3.7.0

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./nmt/scheduler.py:53:44: F821 undefined name 'exp'
                                           exp(self._iter / self.speed))
                                           ^
./nmt/optimizer.py:202:16: F821 undefined name 'nn'
        return nn.utils.clip_grad_norm_(params, self.grad_norm_max, self.grad_norm_type)
               ^
./nmt/models/pooling.py:145:15: F821 undefined name 'PositionalPooling4'
        super(PositionalPooling4, self).__init__()
              ^
./nmt/models/convs2s2D.py:523:20: F821 undefined name 'trg_emb'
                if trg_emb.size(1) > max_h:
                   ^
./nmt/models/convs2s2D.py:524:31: F821 undefined name 'trg_emb'
                    trg_emb = trg_emb[:, -max_h:, :, :]
                              ^
./nmt/models/convs2s2D.py:624:41: F821 undefined name 'trg_labels'
                trg_labels = torch.cat((trg_labels, trg_labels_t),
                                        ^
./nmt/utils/logging.py:17:23: F821 undefined name 'DETOK'
    source = " ".join(DETOK.detokenize(source.split())).encode('utf-8')
                      ^
./nmt/utils/logging.py:18:19: F821 undefined name 'DETOK'
    gt = " ".join(DETOK.detokenize(gt.split())).encode('utf-8')
                  ^
./nmt/utils/logging.py:19:21: F821 undefined name 'DETOK'
    pred = " ".join(DETOK.detokenize(pred.split())).encode('utf-8')
                    ^
./nmt/utils/logging.py:129:16: F821 undefined name 'tf'
    _summary = tf.summary.scalar(name=key,
               ^
./nmt/utils/logging.py:130:41: F821 undefined name 'tf'
                                 tensor=tf.Variable(value),
                                        ^
./nmt/utils/logging.py:132:15: F821 undefined name 'tf'
    summary = tf.Summary(value=[tf.Summary.Value(tag=key, simple_value=value)])
              ^
./nmt/utils/logging.py:132:33: F821 undefined name 'tf'
    summary = tf.Summary(value=[tf.Summary.Value(tag=key, simple_value=value)])
                                ^
./nmt/loss/working_loss.py:565:31: F821 undefined name 'false'
            p.requires_grad = false
                              ^
./nmt/loss/_loss.py:402:31: F821 undefined name 'false'
            p.requires_grad = false
                              ^
./nmt/loss/samplers/ngram.py:72:52: F821 undefined name 'score'
        return preds_matrix, np.ones(batch_size) * score, stats
                                                   ^
16    F821 undefined name 'false'
16
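
The two 'false' findings above are the simplest to fix, since Python's boolean literal is capitalized. The sketch below shows the corrected assignment in isolation. It is only an illustration: freeze is a hypothetical helper, and SimpleNamespace objects stand in for model parameters so that the example runs without PyTorch.

```python
from types import SimpleNamespace


def freeze(params):
    """Mark every parameter-like object as not requiring gradients."""
    for p in params:
        # Python's boolean literals are capitalized: False, not false.
        p.requires_grad = False


# SimpleNamespace objects stand in for model parameters in this demo.
params = [SimpleNamespace(requires_grad=True) for _ in range(3)]
freeze(params)
print([p.requires_grad for p in params])
```

With the lowercase spelling, the loop body raises NameError the first time it runs, which is why flake8 reports it as F821.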

How to use the Simuleval tool to evaluate the model?

Thanks for your work.
How can I use the SimulEval tool to evaluate model performance after training?
I look forward to hearing from you.

Is this a typo?

In attn2d/nmt/models/pooling.py, line 145:

class PositionalPooling(nn.Module):
    def __init__(self, max_length, emb_size):
        super(PositionalPooling4, self).__init__()
        self.src_embedding = nn.Embedding(max_length, emb_size)
        self.trg_embedding = nn.Embedding(max_length, emb_size)
        self.src_embedding.weight.data.fill_(1)
        self.trg_embedding.weight.data.fill_(1)
        self.src_embedding.bias.data.fill_(0)
        self.trg_embedding.bias.data.fill_(0)

Should PositionalPooling4 be PositionalPooling?
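
The mismatch can be reproduced without PyTorch. In this sketch Module is a stand-in for torch.nn.Module and BrokenPooling is a hypothetical class that mirrors the reported line. It illustrates the failure mode and one possible fix, and is not the author's confirmed intent. Separately, as far as I know torch.nn.Embedding defines only a weight parameter, so the two .bias lines in the snippet above would likely fail as well.

```python
class Module:
    """Stand-in for torch.nn.Module, so the sketch runs without PyTorch."""

    def __init__(self):
        self.initialized = True


class BrokenPooling(Module):
    def __init__(self):
        # Mirrors the reported line: PositionalPooling4 is not defined
        # anywhere, so this raises NameError when the class is instantiated.
        super(PositionalPooling4, self).__init__()


class PositionalPooling(Module):
    def __init__(self, max_length, emb_size):
        # Zero-argument super() avoids repeating the class name at all.
        super().__init__()
        self.max_length = max_length
        self.emb_size = emb_size


try:
    BrokenPooling()
    broken_error = None
except NameError as error:
    broken_error = str(error)

pooling = PositionalPooling(max_length=50, emb_size=128)
print(broken_error)
print(pooling.initialized)
```

Because the name is only looked up when __init__ runs, the module imports cleanly and the error appears only once the class is instantiated.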

Training/Eval Error for waitk model

🐛 Bug

I am trying to run the training code following the wait-k guide. I fixed some bugs as @ereday described in this issue, but I still get an error when I run the training command:

RuntimeError: Output 0 of SplitBackward0 is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

k=7
MODEL=tf_wait${k}_wmt14Ende
CUDA_VISIBLE_DEVICES=0 python train.py $DATA_BIN -s en -t de --left-pad-source False \
    --user-dir examples/waitk --arch waitk_transformer_small \
    --save-dir $Workdir/checkpoints/$MODEL --tensorboard-logdir $Workdir/logs/$MODEL \
    --seed 1 --no-epoch-checkpoints --no-progress-bar --log-interval 10  \
    --optimizer adam --adam-betas '(0.9, 0.98)' --weight-decay 0.0001 \
    --max-tokens 4000 --update-freq 2 --max-update 50000 \
    --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr '1e-07' --lr 0.002 \
    --min-lr '1e-9' --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --share-decoder-input-output-embed --waitk  $k

Additional context

This repo seems out of date, and the issue raised half a year ago still has no reply.

default.yaml is necessary

Could you upload the YAML config file used in this paper?
https://arxiv.org/pdf/1808.03867v1.pdf
