
fairseq's Introduction





Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

We provide reference implementations of various sequence modeling papers:

List of implemented papers

What's New:

Previous updates

Features:

We also provide pre-trained models for translation and language modeling with a convenient torch.hub interface:

en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'

See the PyTorch Hub tutorials for translation and RoBERTa for more examples.
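For reference, the pre-trained RoBERTa models are exposed through the same torch.hub interface; a minimal sketch (model name and helper methods as documented in the fairseq examples, so treat the exact shapes as indicative):

import torch

roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout

tokens = roberta.encode('Hello world!')        # BPE-encode into a tensor of token ids
features = roberta.extract_features(tokens)    # last-layer features, roughly (1, T, 1024)
print(features.shape)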

Requirements and Installation

  • PyTorch version >= 1.10.0
  • Python version >= 3.8
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install fairseq and develop locally:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

# on macOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./

# to install the latest stable release (0.10.x)
# pip install fairseq
  • For faster training install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
  • For large datasets install PyArrow: pip install pyarrow
  • If you use Docker, make sure to increase the shared memory size, either with --ipc=host or --shm-size as command line options to nvidia-docker run.
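After installing, a quick sanity check (an illustrative snippet, not an official script) is to import the package and confirm that a GPU is visible:

import torch
import fairseq

print(fairseq.__version__)
print(torch.cuda.is_available())  # should print True if you plan to train on an NVIDIA GPU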

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

We also have more detailed READMEs to reproduce results from specific papers:

Join the fairseq community

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}

fairseq's People

Contributors

alexeib, cndn, davidecaroselli, dianaml0, edunov, erip, freewym, huihuifan, jhcross, jingfeidu, joshim5, kahne, kartikayk, lematt1991, liezl200, liuchen9494, louismartin, maigoakisame, mortimerp9, multipath, myleott, pipibjc, sravyapopuri388, sshleifer, tangyuq, theweiho, vineelpratap, xu-song, xutaima, yuntang


fairseq's Issues

Segmentation fault

I got a segmentation fault when I tried to translate using interactive.py.

$ MODEL_DIR=wmt14.en-fr.fconv-py
$ python interactive.py --path $MODEL_DIR/model.pt $MODEL_DIR --beam 5
Namespace(beam=5, cpu=False, data='wmt14.en-fr.fconv-py', lenpen=1,
log_format=None, log_interval=1000, max_len_a=0, max_len_b=200,
max_source_positions=1024, max_target_positions=1024, nbest=1,
no_beamable_mm=False, no_early_stop=False, no_progress_bar=False,
path=['wmt14.en-fr.fconv-py/model.pt'], quiet=False, remove_bpe=None,
replace_unk=None, seed=1, skip_invalid_size_inputs_valid_test=False,
source_lang=None, target_lang=None, unkpen=0, unnormalized=False,
workers=1)
| loading model(s) from wmt14.en-fr.fconv-py/model.pt
| [en] dictionary: 44206 types
| [fr] dictionary: 44463 types
| Type the input sentence and press return:
Why is it rare to discover new marine mam@@ mal species ?
Segmentation fault

loss increasing with larger input range

When I multiply the input by sqrt(512), the loss always increases, by two to three orders of magnitude, from the beginning of training. The only change is in fconv.py:

def Embedding(num_embeddings, embedding_dim, padding_idx):
    m = nn.Embedding(num_embeddings, embedding_dim, padding_idx=padding_idx)
    m.weight.data.normal_(0, 0.1)
    m.weight.data.mul_(math.sqrt(embedding_dim))  # I add it here
    return m

I thought the operation above only changes the input's magnitude from about 1e-4 to 1e-2. Can you tell me why the loss explodes? PS: I'm sure the loss decreases normally without the mul_.


Here is part of the log:

| epoch 001:   0%|                                         | 13/14254 [00:19<5:38:48,  1.43s/it, loss=1075.60 (897.77), wps=4021, wpb=5703, bsz=131, lr=0.25, clip=100%, gnorm=1689590071670.1538]
| epoch 001:   1%|3                                      | 114/14254 [02:44<5:46:16,  1.47s/it, loss=1859.97 (1038.92), wps=3977, wpb=5695, bsz=169, lr=0.25, clip=100%, gnorm=7599585518747.3330]
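As an aside, a minimal sketch (with embedding_dim = 512 assumed, not taken from the original report) of how much that mul_ changes the activation magnitude, and therefore the gradients that the fixed lr=0.25 is applied to:

import math
import torch
import torch.nn as nn

emb = nn.Embedding(1000, 512)
emb.weight.data.normal_(0, 0.1)
x = torch.randint(0, 1000, (32,))
print(emb(x).abs().mean())            # baseline magnitude, roughly 0.08
emb.weight.data.mul_(math.sqrt(512))  # the extra line from the issue
print(emb(x).abs().mean())            # about 22.6x larger, so the gradients grow by the same factor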

installation from source requires installing cffi

This is a very minor documentation issue.
Note: I'm using python3/pip3, since there is a comment that fairseq-py requires Python 3.
I'm not using Anaconda; I've had issues with package consistency, so I avoid it.
fairseq-py was installed with:
git clone https://github.com/facebookresearch/fairseq-py.git
sudo pip3 install -r requirements.txt

levinth@zt-gpu-lin-1:~/fairseq-py$ sudo python3 setup.py build
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/ffi/__init__.py", line 12, in <module>
import cffi
ImportError: No module named 'cffi'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "setup.py", line 13, in <module>
from torch.utils.ffi import create_extension
File "/usr/local/lib/python3.5/dist-packages/torch/utils/ffi/__init__.py", line 14, in <module>
raise ImportError("torch.utils.ffi requires the cffi package")
ImportError: torch.utils.ffi requires the cffi package
levinth@zt-gpu-lin-1:~/fairseq-py$ pip3 install cffi

and then the build worked
likely can be fixed by adding cffi to requirements.txt

grad_multiply function use?

class GradMultiply(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        res = x.new(x)
        ctx.mark_shared_storage((x, res))
        return res

    @staticmethod
    def backward(ctx, grad):
        return grad * ctx.scale, None

Reading the paper, I have either completely missed or not come across anything about scaling the gradient flowing back.
Please point out where I can get clarity on its use and why it has a positive impact on the result.
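For context, a hedged re-implementation sketch of the same idea using the current autograd.Function API (the scale factor below is hypothetical; fairseq's fconv.py derives it from the number of attention layers so that encoder gradients do not grow with decoder depth):

import torch

class GradScale(torch.autograd.Function):
    # identity in the forward pass; multiplies the incoming gradient by `scale` in the backward pass
    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x.clone()

    @staticmethod
    def backward(ctx, grad):
        return grad * ctx.scale, None

x = torch.randn(4, 8, requires_grad=True)
y = GradScale.apply(x, 1.0 / (2.0 * 15))  # e.g. 15 attention layers (hypothetical)
y.sum().backward()
print(x.grad[0, 0])                       # about 0.033 instead of 1.0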

Exploding in WMT14 en-fr

Hello.
I've preprocessed my data and set the training parameters to match pre-trained models/wmt14.en-fr.fconv-py/README.md. However, I get:

| [en] dictionary: 43881 types
| [fr] dictionary: 43978 types
| data-bin train 35482842 examples
| data-bin valid 26663 examples
| data-bin test 3003 examples
| using 8 GPUs (with max tokens per GPU = 4000)
| model fconv_wmt_en_fr
Warning! 1 samples are either too short or too long and will be ignored, sample ids=[28743556]
| epoch 001  1000 / 331737 loss=9.57 (10.94), wps=18515, wpb=31259, bsz=861, lr=1.25, clip=100%, gnorm=2.0540
| epoch 001  2000 / 331737 loss=8.61 (9.91), wps=18466, wpb=31229, bsz=877, lr=1.25, clip=100%, gnorm=1.7149
| epoch 001  3000 / 331737 loss=7.50 (9.23), wps=18493, wpb=31226, bsz=871, lr=1.25, clip=100%, gnorm=2.7501
| epoch 001  4000 / 331737 loss=6.87 (8.75), wps=18522, wpb=31231, bsz=873, lr=1.25, clip=100%, gnorm=100615.8788
| epoch 001  5000 / 331737 loss=10405.01 (136.96), wps=18532, wpb=31216, bsz=874, lr=1.25, clip=100%, gnorm=1500459828271.3960
| epoch 001  6000 / 331737 loss=4773454961.36 (92926125.94), wps=18564, wpb=31213, bsz=867, lr=1.25, clip=100%, gnorm=37459419138681.4219
| epoch 001  7000 / 331737 loss=7746569234820.15 (126329286789.38), wps=18577, wpb=31211, bsz=864, lr=1.25, clip=100%, gnorm=inf
| epoch 001  8000 / 331737 loss=18016233617.10 (228909462625.55), wps=18562, wpb=31205, bsz=866, lr=1.25, clip=100%, gnorm=inf
| epoch 001  9000 / 331737 loss=6500325670920.53 (321325856038.58), wps=18597, wpb=31214, bsz=860, lr=1.25, clip=100%, gnorm=inf
| epoch 001 10000 / 331737 loss=11162501170786.86 (715142464195.40), wps=18609, wpb=31219, bsz=858, lr=1.25, clip=100%, gnorm=inf
....

--------------------------ENV----------------------------
P40 8cards

--------------------------DATA PREPROCESSING----------------------------

  1. normalize-punctuation
  2. tokenizer
  3. clean-corpus-n
  4. shuffle
  5. learn and apply bpe
    I've checked that the en-fr sentence pairs still correspond after preprocessing.

--------------------------TRAINING PARAMETER----------------------------
fairseq_train_param="-s en -t fr --arch fconv_wmt_en_fr
--dropout 0.1 --lr 1.25 --clip-norm 0.1 --max-tokens 4000 --force-anneal 32"

Can you help me to figure out my problem? Thank you.

fairseq PyTorch setup

I'm trying to run: NO_DISTRIBUTED=1 python setup.py install to install PyTorch.

However I am getting the following error:
...
gcc -pthread -B /home/sarah/miniconda3/envs/fairseq/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/sarah/miniconda3/envs/fairseq/include/python3.6m -c torch/csrc/dl.c -o build/temp.linux-x86_64-3.6/torch/csrc/dl.o
gcc -pthread -shared -B /home/sarah/miniconda3/envs/fairseq/compiler_compat -L/home/sarah/miniconda3/envs/fairseq/lib -Wl,-rpath=/home/sarah/miniconda3/envs/fairseq/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/torch/csrc/dl.o -L/home/sarah/miniconda3/envs/fairseq/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/torch/_dl.cpython-36m-x86_64-linux-gnu.so
/home/sarah/miniconda3/envs/fairseq/compiler_compat/ld: cannot find -lpthread
/home/sarah/miniconda3/envs/fairseq/compiler_compat/ld: cannot find -lc
collect2: error: ld returned 1 exit status
error: command 'gcc' failed with exit status 1

How can I resolve this error?

Following https://github.com/facebookresearch/fairseq-py/issues/19

I have checked to make sure CUDA is installed. When I run nvcc --version I get:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

I have also added the CUDA headers to the build path:
export CPATH=/usr/local/cuda-8.0/include

Thanks!

Why not use torch.nn.utils.clip in multiprocessing_trainer.py?

I don't understand why you wrote your own clip_grads and flatten_grads functions to clip the gradient instead of using torch.nn.utils.clip_grad_norm. I have read the source code of torch.nn.utils.clip_grad_norm and I think it does exactly the same thing as your functions.
Is it because your functions are faster than the original one? Could anyone explain the reason?
Thanks!
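For reference, a minimal sketch of the built-in clipping utility the question refers to (shown here under its current name, torch.nn.utils.clip_grad_norm_):

import torch
import torch.nn as nn

model = nn.Linear(8, 8)
model(torch.randn(4, 8)).sum().backward()
# clips the gradients in place and returns the total norm measured before clipping
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.1)
print(total_norm)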

what does reorder_incremental_state do?

  1. I've seen the description of reorder_incremental_state:

    This should be called when the order of the input has changed from the
    previous time step. A typical use case is beam search, where the input
    order changes between time steps based on the choice of beams.

But I'm still confused by this typical use case: what is the new order at each step? Is it ordered by the cumulative score at the current step?

  2. I've made some changes in fconv.py: I added a GRU decoder, so an fconv decoder and a GRU decoder compute decoder states simultaneously; their features are concatenated and then fully connected to an output layer of size N_dictionary.
    My problem is: the training loss and validation loss look normal, but generation gives BLEU = 0.43 at a training error of 4.56. Can you give me a hint about the cause? Is it because the GRU and fconv decoders share the same reorder_incremental_state function? PS: I've only modified fconv.py.
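As a toy illustration of what the reordering means mechanically (a sketch, not fairseq code): cached decoder state is stored with one row per beam hypothesis, and when the hypotheses are re-ranked or pruned at a step, the cache has to be gathered into the new hypothesis order.

import torch

cached_state = torch.randn(6, 10)             # 6 hypotheses x hidden size (toy sizes)
new_order = torch.tensor([0, 0, 2, 3, 3, 5])  # e.g. hypotheses 0 and 3 each spawned two continuations
cached_state = cached_state.index_select(0, new_order)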

Question about argument "max-token"

Can anyone tell me what max-tokens means?
Is it the same as the vocabulary size, or something else?

"""
Also note that the batch size is specified in terms of the maximum number of tokens per batch (--max-tokens). You may need to use a smaller value depending on the available GPU memory on your system.
"""

How are weights of the layer "LinearizedConvolution" initialized?

I would like to use your ConvS2S model in another task, so I'd like to know the details of the model.
I have read the code defining the class LinearizedConvolution and its parent layer ConvTBC, but I haven't seen any code doing weight initialization.

Could you please let me know how the weights of the LinearizedConvolution layer are initialized?
Thanks!

The epoch counter adds 1 even when loading from an unfinished checkpoint

Hi, I noticed that the epoch counter increments by 1 even when loading from an unfinished last checkpoint saved with the save_interval option. This means that if one restarts training several times from checkpoints within a single epoch, the epoch counter keeps accumulating even though the epoch itself hasn't advanced.

I guess the related code is in the utils.py file (in the load_checkpoint function):
epoch = state['epoch'] + 1
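A hedged sketch of one possible fix, assuming the checkpoint's extra state also records a batch_offset marking where inside the epoch it was saved (the field name is an assumption):

# only advance the epoch counter if the checkpoint was saved at an epoch boundary
batch_offset = state.get('batch_offset', 0)  # assumed field; 0 is taken to mean the epoch finished
epoch = state['epoch'] + 1 if batch_offset == 0 else state['epoch']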

OS X - Error compiling temporal_convolution_tbc

OS: OS X 10.13 High Sierra
CUDA: v. 9.0

When trying to compile fairseq-py on OSX, I'm getting the error:

error: non-const lvalue reference to type 'at::Tensor' cannot bind to a temporary of type 'int'

I've tried using g++, with similar results:

error: invalid initialization of non-const reference of type 'at::Tensor&' from an rvalue of type 'int'

Here's the full output:

$ CC=clang++ python setup.py build
running build
running build_py
generating /var/folders/1f/z12pg_yx2194lgdvlb_kprb40000gp/T/tmpemf541ha/_temporal_convolution_tbc.c
running build_ext
building '_temporal_convolution_tbc' extension
creating Users
creating Users/mastover
creating Users/mastover/git
creating Users/mastover/git/fairseq-py
creating Users/mastover/git/fairseq-py/fairseq
creating Users/mastover/git/fairseq-py/fairseq/clib
creating Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc
clang++ -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -Wstrict-prototypes -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -DWITH_CUDA -I/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/Developer/NVIDIA/CUDA-9.0/include -I/Developer/NVIDIA/CUDA-8.0/include -I/Users/mastover/anaconda3/include/python3.6m -c _temporal_convolution_tbc.c -o ./_temporal_convolution_tbc.o -std=c++11
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated]
clang++ -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -Wstrict-prototypes -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -DWITH_CUDA -I/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/Developer/NVIDIA/CUDA-9.0/include -I/Developer/NVIDIA/CUDA-8.0/include -I/Users/mastover/anaconda3/include/python3.6m -c /Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp -o ./Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.o -std=c++11
/Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:70:21: error: non-const lvalue reference to type 'at::Tensor' cannot bind to a temporary of type 'int'
at::addmm_out(1, O, 1, I, W, O);
^
/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1203:43: note: passing argument to parameter 'result' here
static inline Tensor & addmm_out(Tensor & result, const Tensor & self, const Tensor & mat1, const Tensor & mat2, Scalar beta, Scalar alpha) {
^
/Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:111:21: error: non-const lvalue reference to type 'at::Tensor' cannot bind to a temporary of type 'int'
at::addmm_out(1, dI, 1, dO, weight[k].t(), dI);
^
/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1203:43: note: passing argument to parameter 'result' here
static inline Tensor & addmm_out(Tensor & result, const Tensor & self, const Tensor & mat1, const Tensor & mat2, Scalar beta, Scalar alpha) {
^
/Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:124:21: error: non-const lvalue reference to type 'at::Tensor' cannot bind to a temporary of type 'int'
at::addmm_out(1, dW, 1, I, dO, dW);
^
/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1203:43: note: passing argument to parameter 'result' here
static inline Tensor & addmm_out(Tensor & result, const Tensor & self, const Tensor & mat1, const Tensor & mat2, Scalar beta, Scalar alpha) {
^
/Users/mastover/git/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:129:20: error: reference to type 'const at::Tensor' could not bind to an rvalue of type 'int'
at::sum_out(tmp, 0, dBias);
^
/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1044:64: note: passing argument to parameter 'self' here
static inline Tensor & sum_out(Tensor & result, const Tensor & self, int64_t dim, bool keepdim) {
^
4 errors generated.
Traceback (most recent call last):
File "/Users/mastover/anaconda3/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
extra_postargs)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
spawn(cmd, dry_run=self.dry_run)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/spawn.py", line 36, in spawn
_spawn_posix(cmd, search_path, dry_run=dry_run)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
% (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'clang++' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 49, in _build
dist.run_command('build_ext')
File "/Users/mastover/anaconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 75, in run
_build_ext.run(self)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 185, in run
_build_ext.build_ext.run(self)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 193, in build_extensions
self.build_extension(ext)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/ccompiler.py", line 574, in compile
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
raise CompileError(msg)
distutils.errors.CompileError: command 'clang++' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "setup.py", line 69, in
'build_py': build_py_hook,
File "/Users/mastover/anaconda3/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/Users/mastover/anaconda3/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/Users/mastover/anaconda3/lib/python3.6/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/Users/mastover/anaconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "setup.py", line 49, in run
conv_tbc.build()
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 164, in build
_build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 100, in _build_extension
ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/cffi/api.py", line 684, in compile
compiler_verbose=verbose, debug=debug, **kwds)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/cffi/recompiler.py", line 1484, in recompile
compiler_verbose, debug)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 20, in compile
outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
File "/Users/mastover/anaconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 56, in _build
raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'clang++' failed with exit status 1

Question about sample from dataloader

There are many places with code like the snippet below.
Can I get only part of the data through itr, or is there a better method?

I want to sample just 40% of all the data for training, and still get the data batch by batch.

Thank you.

itr = dataset.dataloader(args.train_subset, num_workers=args.workers,
                    max_tokens=args.max_tokens, seed=args.seed, epoch=epoch,
                    max_positions=args.max_positions,
                    sample_without_replacement=args.sample_without_replacement,
                    skip_invalid_size_inputs_valid_test=args.skip_invalid_size_inputs_valid_test)

num_sentences = 0

with progress_bar(itr, smoothing=0, leave=False) as t:
    wps_meter = TimeMeter()
    gen_timer = StopwatchMeter()
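A minimal sketch of one way to do this, assuming itr from the snippet above is a plain iterator over batches (none of this is fairseq API): keep roughly 40% of the batches while still consuming them one at a time.

import random

random.seed(args.seed)  # keep the subsample reproducible

# randomly keep about 40% of the batches
sampled_itr = (batch for batch in itr if random.random() < 0.4)

# or, with itertools.islice, take only the first 40% when the number of batches is known:
# import itertools
# sampled_itr = itertools.islice(itr, int(0.4 * num_batches))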

The no_progress_bar option doesn't work

Hi, on two of my computers (Ubuntu 16.04 and 14.04) the no-progress-bar option does not work as expected: the program still prints a new line for each iteration. The option only works on a third computer running Debian, where the script prints the status according to the log-interval option.

And when I keep the progress bar enabled (no-progress-bar=False) and replace progress_bar in train.py with the native tqdm object (with options, as shown below), the status is still printed very rapidly.

 with progress_bar(itr, desc, leave=False) as t: #original
 with tqdm(itr, desc, leave=False,miniters=1000,mininterval=10.0) as t: #replaced

I suspect this could be an issue with the tqdm tool. My current workaround is to pass refresh=False to the t.set_postfix(collections.OrderedDict(...)) call in train.py:
t.set_postfix(collections.OrderedDict(), refresh=False)
After that, the progress bar no longer prints at every iteration, which is the expected behavior when no_progress_bar is True.

Could anyone suggest the possible reason for this issue? Thanks.

ERROR: missing temporal_convolution_tbc, run `python setup.py install`

Hello, I'm a beginner. When I run
CUDA_VISIBLE_DEVICES=6 python train.py data-bin/iwslt14.tokenized.de-en --lr 0.25 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 --arch fconv_iwslt_de_en --save-dir checkpoints/fconv

I get the following error:

ERROR: missing temporal_convolution_tbc, run python setup.py install
Traceback (most recent call last):
File "train.py", line 14, in <module>
from fairseq import bleu, data, options, utils
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/options.py", line 11, in <module>
from fairseq import models
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/models/__init__.py", line 9, in <module>
from . import fconv
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/models/fconv.py", line 14, in <module>
from fairseq.modules import BeamableMM, LinearizedConvolution
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/modules/__init__.py", line 10, in <module>
from .conv_tbc import ConvTBC
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/modules/conv_tbc.py", line 18, in <module>
raise e
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/modules/conv_tbc.py", line 14, in <module>
from fairseq import temporal_convolution_tbc
File "/home/hfyu/fairseq16/fairseq-py-master/fairseq/temporal_convolution_tbc/__init__.py", line 3, in <module>
from ._temporal_convolution_tbc import lib as _lib, ffi as _ffi
ImportError: /home/hfyu/fairseq16/fairseq-py-master/fairseq/temporal_convolution_tbc/_temporal_convolution_tbc.so: undefined symbol: _ZNK2at4Type4copyERKNS_6TensorERS1

I don't know how to deal with it. Could you help me? Thank you.

fatal error: ATen/ATen.h: No such file or directory

After running:
python setup.py build
I get:

running build
running build_py
generating /tmp/tmpz4hbar33/_temporal_convolution_tbc.c
running build_ext
building '_temporal_convolution_tbc' extension
creating home
creating home/vlad
creating home/vlad/work
creating home/vlad/work/seq2seq
creating home/vlad/work/seq2seq/fairseq-py
creating home/vlad/work/seq2seq/fairseq-py/fairseq
creating home/vlad/work/seq2seq/fairseq-py/fairseq/clib
creating home/vlad/work/seq2seq/fairseq-py/fairseq/clib/temporal_convolution_tbc
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/vlad/anaconda3/envs/seq2seqTorch/include/python3.6m -c _temporal_convolution_tbc.c -o ./_temporal_convolution_tbc.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/vlad/anaconda3/envs/seq2seqTorch/include/python3.6m -c /home/vlad/work/seq2seq/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp -o ./home/vlad/work/seq2seq/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.o -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/home/vlad/work/seq2seq/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:12:23: fatal error: ATen/ATen.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
    extra_postargs)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
    spawn(cmd, dry_run=self.dry_run)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/spawn.py", line 36, in spawn
    _spawn_posix(cmd, search_path, dry_run=dry_run)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
    % (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/cffi/ffiplatform.py", line 49, in _build
    dist.run_command('build_ext')
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/build_ext.py", line 77, in run
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 185, in run
    _build_ext.build_ext.run(self)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/Cython/Distutils/old_build_ext.py", line 193, in build_extensions
    self.build_extension(ext)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/build_ext.py", line 198, in build_extension
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
    raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 69, in <module>
    'build_py': build_py_hook,
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 49, in run
    conv_tbc.build()
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 164, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 100, in _build_extension
    ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/cffi/api.py", line 684, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/cffi/recompiler.py", line 1484, in recompile
    compiler_verbose, debug)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/cffi/ffiplatform.py", line 20, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/home/vlad/anaconda3/envs/seq2seqTorch/lib/python3.6/site-packages/cffi/ffiplatform.py", line 56, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'gcc' failed with exit status 1

CUDA

cat /usr/local/cuda/version.txt
CUDA Version 8.0.61

Is the padding use zero-embedding?

As data.py shows, you use padding to get a [batch_size, max_sentence_length] token input (in the class PaddingCollater).

And in fconv.py, you use Embedding without specifying that the embedding of <pad> is an all-zero vector.

Am I missing something? Should the padding embedding be all zeros, or does it not matter?

Thanks.
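For reference, a minimal illustration of PyTorch's padding_idx behavior: the pad row starts as zeros and never receives gradient, so an explicit zero vector is only needed if the whole weight matrix is re-initialized afterwards.

import torch
import torch.nn as nn

emb = nn.Embedding(10, 4, padding_idx=0)
print(emb.weight[0])        # the pad row is initialized to zeros
out = emb(torch.tensor([0, 3]))
out.sum().backward()
print(emb.weight.grad[0])   # stays all zeros: the pad row receives no gradient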

_temporal_convolution_tbc.c:434:21: fatal error: THC/THC.h: No such file or directory

Hi, when I try to install following your instructions, I get the error below during python setup.py build.

This is what I got:

running install
running bdist_egg
running egg_info
writing fairseq.egg-info/PKG-INFO
writing dependency_links to fairseq.egg-info/dependency_links.txt
writing requirements to fairseq.egg-info/requires.txt
writing top-level names to fairseq.egg-info/top_level.txt
reading manifest file 'fairseq.egg-info/SOURCES.txt'
writing manifest file 'fairseq.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
generating /tmp/tmpj5grradm/_temporal_convolution_tbc.c
running build_ext
building '_temporal_convolution_tbc' extension
creating nfs
creating nfs/guille
creating nfs/guille/huang
creating nfs/guille/huang/users
creating nfs/guille/huang/users/yangyil
creating nfs/guille/huang/users/yangyil/fairseq-py
creating nfs/guille/huang/users/yangyil/fairseq-py/fairseq
creating nfs/guille/huang/users/yangyil/fairseq-py/fairseq/clib
creating nfs/guille/huang/users/yangyil/fairseq-py/fairseq/clib/temporal_convolution_tbc
gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/nfs/stak/users/yangyil/shared/miniconda3/include/python3.6m -c _temporal_convolution_tbc.c -o ./_temporal_convolution_tbc.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
_temporal_convolution_tbc.c:434:21: fatal error: THC/THC.h: No such file or directory
 #include <THC/THC.h>
                     ^
compilation terminated.
Traceback (most recent call last):
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
    extra_postargs)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
    spawn(cmd, dry_run=self.dry_run)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/spawn.py", line 36, in spawn
    _spawn_posix(cmd, search_path, dry_run=dry_run)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
    % (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 49, in _build
    dist.run_command('build_ext')
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/build_ext.py", line 77, in run
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/build_ext.py", line 198, in build_extension
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/ccompiler.py", line 574, in compile
    self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
    raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 69, in <module>
    'build_py': build_py_hook,
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/install.py", line 67, in run
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/install.py", line 109, in do_egg_install
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/bdist_egg.py", line 161, in run
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/bdist_egg.py", line 147, in call_command
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/command/install_lib.py", line 11, in run
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/command/install_lib.py", line 105, in build
    self.run_command('build_py')
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 49, in run
    conv_tbc.build()
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 164, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 100, in _build_extension
    ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/cffi/api.py", line 684, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/cffi/recompiler.py", line 1484, in recompile
    compiler_verbose, debug)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 20, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/nfs/stak/users/yangyil/shared/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 56, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'gcc' failed with exit status 1

I built PyTorch from source and reset it as you indicate.
These are all the libraries installed in this environment.

$ conda list
# packages in environment at /nfs/guille/huang/users/yangyil/miniconda3:
#
asn1crypto                0.22.0                   py36_0
cffi                      1.10.0                   py36_0
cloog                     0.18.0                        0
cmake                     0.8.0                     <pip>
conda                     4.3.25                   py36_0
conda-env                 2.6.0                         0
cryptography              1.8.1                    py36_0
cuda80                    1.0                           0    soumith
cudatoolkit               8.0                           1
cudnn                     6.0.21                cuda8.0_0
gcc                       4.8.5                         7
gmp                       6.1.0                         0
idna                      2.5                      py36_0
isl                       0.12.2                        0
libffi                    3.2.1                         1
magma-cuda80              2.1.0                         5    soumith
mkl                       2017.0.3                      0
mpc                       1.0.3                         0
mpfr                      3.1.5                         0
nccl                      1.3.4                 cuda8.0_1
numpy                     1.13.1                   py36_0
openssl                   1.0.2l                        0
packaging                 16.8                     py36_0
pip                       9.0.1                    py36_1
pycosat                   0.6.2                    py36_0
pycparser                 2.17                     py36_0
pyopenssl                 17.0.0                   py36_0
pyparsing                 2.1.4                    py36_0
python                    3.6.1                         2
PyYAML                    3.12                      <pip>
readline                  6.2                           2
requests                  2.14.2                   py36_0
ruamel_yaml               0.11.14                  py36_1
setuptools                27.2.0                   py36_0
six                       1.10.0                   py36_0
sqlite                    3.13.0                        0
tk                        8.5.18                        0
torch                     0.2.0+a03e5cb             <pip>
tqdm                      4.17.1                    <pip>
wheel                     0.29.0                   py36_0
xz                        5.2.2                         1
yaml                      0.1.6                         0
zlib                      1.2.8                         3

bad syntax in preprocess.py

levinth@zt-gpu-lin-1:~/fairseq-py$ python3 preprocess.py --source-lang de --target-lang en --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test --destdir data-bin/iwslt14.tokenized.de-en
File "preprocess.py", line 84
output_text_file = os.path.join(args.destdir, f'{output_prefix}.{lang}')
^
SyntaxError: invalid syntax

Replacing line 84 with
output_text_file = os.path.join(args.destdir, '{output_prefix}.{lang}')
makes the syntax error go away (f-strings require Python >= 3.6), but it also drops the interpolation, so the output file would literally be named '{output_prefix}.{lang}'.
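A Python 3.5-compatible replacement for that line which keeps the interpolation (a sketch using str.format instead of an f-string):

output_text_file = os.path.join(args.destdir, '{}.{}'.format(output_prefix, lang))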

Conv1d instead of LinearizedConvolution and convTBC

I am trying to implement a simpler version of this code, and I find it very hard to see why LinearizedConvolution and ConvTBC are used instead of simply Conv1d, what the intuition behind this is, and why two different schemes are used in the encoder and the decoder.

Can you answer this question?
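As a rough illustration of the layout difference (a toy sketch, not fairseq's implementation): ConvTBC computes a convolution over time on (T, B, C) tensors, which plain nn.Conv1d expresses on (B, C, T), while LinearizedConvolution exists so that at inference time the decoder can be fed one time step at a time instead of re-convolving the whole prefix.

import torch
import torch.nn as nn

T, B, C_in, C_out, k = 7, 2, 16, 32, 3
x_tbc = torch.randn(T, B, C_in)                # time x batch x channels layout
conv = nn.Conv1d(C_in, C_out, kernel_size=k, padding=k - 1)
y = conv(x_tbc.permute(1, 2, 0))               # Conv1d expects batch x channels x time
y_tbc = y.permute(2, 0, 1)                     # back to time x batch x channels
print(y_tbc.shape)                             # (T + k - 1, B, C_out)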

When I install fairseq-py there is an error; my system is Ubuntu. @myleott @colesbury

python setup.py build
running build
running build_py
generating /tmp/tmp5mo3c81l/_temporal_convolution_tbc.c
running build_ext
building '_temporal_convolution_tbc' extension
creating root
creating root/fairseq-py
creating root/fairseq-py/fairseq
creating root/fairseq-py/fairseq/clib
creating root/fairseq-py/fairseq/clib/temporal_convolution_tbc
gcc -pthread -B /root/miniconda3/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/root/miniconda3/include/python3.6m -c _temporal_convolution_tbc.c -o ./_temporal_convolution_tbc.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
gcc -pthread -B /root/miniconda3/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/root/miniconda3/include/python3.6m -c /root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp -o ./root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.o -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
/root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp: In function ‘void TemporalConvolutionTBC_forward(const char*, void*, void*, void*, void*)’:
/root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:70:37: error: invalid initialization of non-const reference of type ‘at::Tensor&’ from an rvalue of type ‘int’
at::addmm_out(1, O, 1, I, W, O);
^
In file included from /root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/ATen.h:10:0,
from /root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:12:
/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1270:24: error: in passing argument 1 of ‘at::Tensor& at::addmm_out(at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, at::Scalar, at::Scalar)’
static inline Tensor & addmm_out(Tensor & result, const Tensor & self, const Tensor & mat1, const Tensor & mat2, Scalar beta, Scalar alpha) {
^
/root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp: In function ‘void TemporalConvolutionTBC_backward(const char*, void*, void*, void*, void*, void*, void*)’:
/root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:111:52: error: invalid initialization of non-const reference of type ‘at::Tensor&’ from an rvalue of type ‘int’
at::addmm_out(1, dI, 1, dO, weight[k].t(), dI);
^
In file included from /root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/ATen.h:10:0,
from /root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:12:
/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1270:24: error: in passing argument 1 of ‘at::Tensor& at::addmm_out(at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, at::Scalar, at::Scalar)’
static inline Tensor & addmm_out(Tensor & result, const Tensor & self, const Tensor & mat1, const Tensor & mat2, Scalar beta, Scalar alpha) {
^
/root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:124:40: error: invalid initialization of non-const reference of type ‘at::Tensor&’ from an rvalue of type ‘int’
at::addmm_out(1, dW, 1, I, dO, dW);
^
In file included from /root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/ATen.h:10:0,
from /root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:12:
/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1270:24: error: in passing argument 1 of ‘at::Tensor& at::addmm_out(at::Tensor&, const at::Tensor&, const at::Tensor&, const at::Tensor&, at::Scalar, at::Scalar)’
static inline Tensor & addmm_out(Tensor & result, const Tensor & self, const Tensor & mat1, const Tensor & mat2, Scalar beta, Scalar alpha) {
^
/root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:129:28: error: invalid initialization of reference of type ‘const at::Tensor&’ from expression of type ‘int’
at::sum_out(tmp, 0, dBias);
^
In file included from /root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/ATen.h:10:0,
from /root/fairseq-py/fairseq/clib/temporal_convolution_tbc/temporal_convolution_tbc.cpp:12:
/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/ATen/Functions.h:1111:24: error: in passing argument 2 of ‘at::Tensor& at::sum_out(at::Tensor&, const at::Tensor&, int64_t, bool)’
static inline Tensor & sum_out(Tensor & result, const Tensor & self, int64_t dim, bool keepdim) {
^
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
extra_postargs)
File "/root/miniconda3/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
spawn(cmd, dry_run=self.dry_run)
File "/root/miniconda3/lib/python3.6/distutils/spawn.py", line 36, in spawn
_spawn_posix(cmd, search_path, dry_run=dry_run)
File "/root/miniconda3/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
% (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 49, in _build
dist.run_command('build_ext')
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/root/miniconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 75, in run
_build_ext.run(self)
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
self._build_extensions_serial()
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
self.build_extension(ext)
File "/root/miniconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/root/miniconda3/lib/python3.6/distutils/ccompiler.py", line 574, in compile
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
File "/root/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "setup.py", line 69, in
'build_py': build_py_hook,
File "/root/miniconda3/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/root/miniconda3/lib/python3.6/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/root/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "setup.py", line 49, in run
conv_tbc.build()
File "/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 167, in build
_build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
File "/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 103, in _build_extension
ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/api.py", line 684, in compile
compiler_verbose=verbose, debug=debug, **kwds)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/recompiler.py", line 1484, in recompile
compiler_verbose, debug)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 20, in compile
outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 56, in _build
raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'gcc' failed with exit status 1

Why not data_parallel?

I wonder why you implemented the multi-GPU training using a custom event loop instead of torch.nn.DataParallel. I suppose it is for performance reasons?
If so, what is the main bottleneck in DataParallel that prevents you from using it? Do you have an estimate of how large the speed-up is compared to the (simpler) DataParallel solution?

static inline Tensor & sum_out(Tensor & result, const Tensor & self, int64_t dim, bool keepdim) {
^
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
extra_postargs)
File "/root/miniconda3/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
spawn(cmd, dry_run=self.dry_run)
File "/root/miniconda3/lib/python3.6/distutils/spawn.py", line 36, in spawn
_spawn_posix(cmd, search_path, dry_run=dry_run)
File "/root/miniconda3/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
% (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 49, in _build
dist.run_command('build_ext')
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/root/miniconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 75, in run
_build_ext.run(self)
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
self._build_extensions_serial()
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
self.build_extension(ext)
File "/root/miniconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/root/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/root/miniconda3/lib/python3.6/distutils/ccompiler.py", line 574, in compile
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
File "/root/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "setup.py", line 69, in
'build_py': build_py_hook,
File "/root/miniconda3/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/root/miniconda3/lib/python3.6/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/root/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/root/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "setup.py", line 49, in run
conv_tbc.build()
File "/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 167, in build
_build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
File "/root/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 103, in _build_extension
ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/api.py", line 684, in compile
compiler_verbose=verbose, debug=debug, **kwds)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/recompiler.py", line 1484, in recompile
compiler_verbose, debug)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 20, in compile
outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
File "/root/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 56, in _build
raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'gcc' failed with exit status 1

"python setup.py build" not fails with UnicodeDecodeError

Hi,
I'm following the setup instructions as in the README.md here: https://github.com/facebookresearch/fairseq-py#requirements-and-installation
I was able to install PyTorch from source at commit a03e5cb4, as per this README. After that I cloned the fairseq-py repo and was able to install its requirements. But when I try to build and install fairseq-py, I see the following error:

$ python setup.py build
Traceback (most recent call last):
  File "setup.py", line 19, in <module>
    readme = f.read()
  File "/opt/conda/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4007: ordinal not in range(128)

The issue is that line 89 of README.md contains Unicode characters, which seems to be causing the installation error. The simplest hack that got me past it was to remove that line:

$ grep -v Pourquoi README.md > README.md.bak
$ mv README.md.bak README.md

After this, I was able to install fairseq-py successfully.
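
A less destructive workaround, assuming the traceback's readme = f.read() comes from a plain open() call in setup.py, is to read the README with an explicit UTF-8 encoding instead of deleting the offending line:

# Sketch: force UTF-8 rather than relying on the locale's default codec,
# which in this environment falls back to ASCII.
import io

with io.open('README.md', encoding='utf-8') as f:
    readme = f.read()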

FWIW, my full system info is given at the end.

Regards,
Thejaswi

My system info:

$ uname -a
Linux 92bc62ee5c27 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
LSB Version:    core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial

$ python --version
Python 3.5.2 :: Continuum Analytics, Inc.

$ conda --version
conda 4.3.25

$ gcc --version
gcc (GCC) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

$ ls /usr/local/cuda/lib64/libcudnn*
/usr/local/cuda/lib64/libcudnn.so  /usr/local/cuda/lib64/libcudnn.so.6  /usr/local/cuda/lib64/libcudnn.so.6.0.21  /usr/local/cuda/lib64/libcudnn_static.a

install fairseq-py

I want to install fairseq-py on a computer with no GPU, but when I execute 'python setup.py install' there is an error. What should I do to avoid this error?

linhanxiao@linhanxiao-K40ID:~/fairseq-py$ python setup.py install
running install
running bdist_egg
running egg_info
writing fairseq.egg-info/PKG-INFO
writing dependency_links to fairseq.egg-info/dependency_links.txt
writing requirements to fairseq.egg-info/requires.txt
writing top-level names to fairseq.egg-info/top_level.txt
reading manifest file 'fairseq.egg-info/SOURCES.txt'
writing manifest file 'fairseq.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
generating /tmp/tmp_2r9ssq4/_temporal_convolution_tbc.c
running build_ext
building '_temporal_convolution_tbc' extension
creating home
creating home/linhanxiao
creating home/linhanxiao/fairseq-py
creating home/linhanxiao/fairseq-py/fairseq
creating home/linhanxiao/fairseq-py/fairseq/clib
creating home/linhanxiao/fairseq-py/fairseq/clib/temporal_convolution_tbc
gcc -pthread -B /home/linhanxiao/miniconda3/compiler_compat -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include -I/home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC -I/home/linhanxiao/miniconda3/include/python3.6m -c _temporal_convolution_tbc.c -o ./_temporal_convolution_tbc.o -std=c++11
cc1: warning: command line option ‘-std=c++11’ is valid for C++/ObjC++ but not for C [enabled by default]
In file included from /home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0,
from _temporal_convolution_tbc.c:434:
/home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/../../lib/include/THC/THCGeneral.h:9:18: fatal error: cuda.h: No such file or directory
#include "cuda.h"
^
compilation terminated.
Traceback (most recent call last):
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 118, in _compile
extra_postargs)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/ccompiler.py", line 909, in spawn
spawn(cmd, dry_run=self.dry_run)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/spawn.py", line 36, in spawn
_spawn_posix(cmd, search_path, dry_run=dry_run)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/spawn.py", line 159, in _spawn_posix
% (cmd, exit_status))
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 49, in _build
dist.run_command('build_ext')
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 75, in run
_build_ext.run(self)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
self._build_extensions_serial()
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
self.build_extension(ext)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/ccompiler.py", line 574, in compile
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/unixccompiler.py", line 120, in _compile
raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "setup.py", line 69, in
'build_py': build_py_hook,
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/install.py", line 67, in run
self.do_egg_install()
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/install.py", line 109, in do_egg_install
self.run_command('bdist_egg')
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/bdist_egg.py", line 169, in run
cmd = self.call_command('install_lib', warn_dir=0)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/bdist_egg.py", line 155, in call_command
self.run_command(cmdname)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/setuptools/command/install_lib.py", line 11, in run
self.build()
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/command/install_lib.py", line 105, in build
self.run_command('build_py')
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/linhanxiao/miniconda3/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "setup.py", line 49, in run
conv_tbc.build()
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 164, in build
_build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/torch/utils/ffi/init.py", line 100, in _build_extension
ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/cffi/api.py", line 684, in compile
compiler_verbose=verbose, debug=debug, **kwds)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/cffi/recompiler.py", line 1484, in recompile
compiler_verbose, debug)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 20, in compile
outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
File "/home/linhanxiao/miniconda3/lib/python3.6/site-packages/cffi/ffiplatform.py", line 56, in _build
raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.error.VerificationError: CompileError: command 'gcc' failed with exit status 1

Typo in fairseq-py/preprocess.py

I think

parser.add_argument('--trainpref', metavar='FP', default='train', help='target language')

should be

parser.add_argument('--trainpref', metavar='FP', default='train', help='comma separated, train language prefixes' )

to be in line with

parser.add_argument('--validpref', metavar='FP', default='valid', help='comma separated, valid language prefixes')
parser.add_argument('--testpref', metavar='FP', default='test', help='comma separated, test language prefixes')

training and inference

How does the model work at inference time? I can't see how the decoder of the FConv model works during training versus inference, and from the paper I am also confused about how I am supposed to run inference with it.

Unable to restore pre-trained models during training

I ran this command:

python train.py data-bin/wmt14.en-fr.newstest2014/ --restore-file wmt14.en-fr.fconv-py/model.pt -s en -t fr --save-dir ./
and this gives the following error:

Traceback (most recent call last):
  File "train.py", line 261, in <module>
Process SpawnProcess-1:
    main()
  File "train.py", line 76, in main
Traceback (most recent call last):
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
    extra_state = trainer.load_checkpoint(checkpoint_path)
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_trainer.py", line 131, in load_checkpoint
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 145, in _process_event_loop
    self.error_queue.put((rank, traceback.format_exc()))
KeyboardInterrupt
    for rank in range(self.num_replicas)
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 162, in gen_list
    return [g.gen() for g in gens]
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 162, in <listcomp>
    return [g.gen() for g in gens]
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 158, in gen
    return next(self.generator)
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 37, in result_generator
    yield self.return_pipes[rank].recv()
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 91, in _signal_handler
    raise Exception(msg)
Exception: 

-- Tracebacks above this line can probably be ignored --

Traceback (most recent call last):
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 476, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: invalid argument 2: sizes do not match at /mnt/disks/disk2/pytorch/aten/src/THC/THCTensorCopy.cu:31

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_event_loop.py", line 134, in _process_event_loop
    return_pipe.send(action_fn(rank, device_id, **kwargs))
  File "/mnt/disks/disk2/fairseq-py/fairseq/multiprocessing_trainer.py", line 139, in _async_load_checkpoint
    self.lr_scheduler, cuda_device=device_id)
  File "/mnt/disks/disk2/fairseq-py/fairseq/utils.py", line 83, in load_state
    model.load_state_dict(state['model'])
  File "/mnt/disks/disk2/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 481, in load_state_dict
    .format(name, own_state[name].size(), param.size()))
RuntimeError: While copying the parameter named {}, whose dimensions in the model are {} and whose dimensions in the checkpoint are encoder.embed_tokens.weight.

I get this error when I try to restore the provided En-Fr model during training/validation. No error arises during testing. I am unable to find the exact source of the problem or how to fix it. Please help me out with this. Thanks!
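
For what it's worth, a minimal repro of the failure mode itself (not of your setup): load_state_dict() refuses to copy a parameter whose shape differs between the model and the checkpoint, which for encoder.embed_tokens.weight would happen if the dictionaries in your data-bin directory differ in size from the ones the released model was built with (an assumption on my part, not something the traceback proves).

import torch.nn as nn

model = nn.Embedding(100, 8)        # model built with a 100-entry vocabulary
checkpoint = nn.Embedding(120, 8)   # pretend checkpoint built with 120 entries
model.load_state_dict(checkpoint.state_dict())  # raises: size mismatch for 'weight'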

How to reproduce results on the summarization task?

Hi,
I am trying to reproduce experimental results on two text summarization tasks: DUC-2004 and Gigaword. Is it possible to provide an example of training a model on summarization? In addition, how can I set parameters so that ROUGE scores in the paper can be reproduced? Thanks.

Error for new string format syntax

You forgot to revert this line.
Thank you.

  File "/home/playma/Research/fairseq-py/fairseq/utils.py", line 53
    raise ValueError(f'Unknown log format: {args.log_format}')
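
For reference, f-strings are only valid on Python 3.6+, so on an older interpreter this line is a SyntaxError by itself. A sketch of an equivalent that works on earlier versions (log_format is just a stand-in for args.log_format):

log_format = 'unknown'  # stand-in for args.log_format
raise ValueError('Unknown log format: {}'.format(log_format))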

Error in fairseq/data.py

Excuse me, I got an error after pulling the new repository.

Can anyone help me?

Traceback (most recent call last):
  File "pretrain_discriminator.py", line 11, in <module>
    from gan import trainer
  File "/home/playma/Research/fairseq-py/gan/trainer.py", line 1, in <module>
    from fairseq import data, utils
  File "/home/playma/Research/fairseq-py/fairseq/data.py", line 24
    if len(glob.glob(os.path.join(data_dir, f'{split}.*-*.*.bin'))) < 2:
                                                               ^
SyntaxError: invalid syntax

Log_softmax out of memory

I am sorry for bothering you again.
The problem is caused by my dict.*.txt file having more than 500,000 lines, so it runs out of memory when computing the log_softmax.
I have two 10 GB GPUs.
Apart from adding GPUs to my machine, are there other approaches to address this?

Thank you!

stack smashing detected

When I try to run the training as described in the readme, I get the following error

*** stack smashing detected ***: /mnt/nfs/users/t-lamesc/Apps/anaconda3/envs/torch/bin/python terminated

Using ipdb, I tracked the error down to here.

When I remove CUDA_VISIBLE_DEVICES=0, i.e. when I try to train on all 8 GPUs, I get

python: malloc.c:3720: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.
python: malloc.c:3720: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.
python: malloc.c:3720: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.
python: malloc.c:3720: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.
python: malloc.c:3720: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed. 

instead. The tests all ran okay. Any ideas?

Out of memory error when train from a state checkpoint

Since the recent updates, peak memory usage is about 10% higher at the point where the program has just finished loading the optimizer history and is starting the remaining training work. Could this be caused by additional info that has been added to the loaded state, or by a garbage-collection issue?

This frequently happens when loading from an unfinished checkpoint, and I didn't see it happen before the recent PR33
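
One way to see what extra state the checkpoint is carrying, assuming it is a plain torch.save() dictionary (the file name below is only an example), is to load it onto the CPU and inspect the entries before anything touches the GPU:

import torch

# map_location keeps every tensor on the CPU, so inspection does not add
# to GPU peak memory.
state = torch.load('checkpoint_last.pt',
                   map_location=lambda storage, loc: storage)
for key, value in state.items():
    info = value.numel() if torch.is_tensor(value) else type(value).__name__
    print(key, info)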

How can I rollout step by step

Can I reuse the code in sequence_generator.py to roll out the target step by step for RL?

I want to reproduce the code in "Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets"
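
A rough sketch of what a step-by-step rollout loop can look like, written against a hypothetical encoder/decoder interface (model.encoder and model.decoder below are placeholders, not fairseq's actual API) with simple multinomial sampling:

import torch

def rollout(model, src_tokens, bos, eos, max_len=200):
    """Sample a target sequence one token at a time (hypothetical interface)."""
    model.eval()
    encoder_out = model.encoder(src_tokens)                 # placeholder call
    prev = src_tokens.new_full((src_tokens.size(0), 1), bos)
    for _ in range(max_len):
        logits = model.decoder(prev, encoder_out)           # placeholder call
        probs = torch.softmax(logits[:, -1, :], dim=-1)     # distribution over last step
        next_tok = torch.multinomial(probs, 1)              # sample rather than argmax
        prev = torch.cat([prev, next_tok], dim=1)
        if (next_tok == eos).all():
            break
    return prev

For a policy-gradient setup you would also keep the log-probabilities of the sampled tokens, which is where reusing pieces of sequence_generator.py could save work.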


Architecture Clarification

Hi,

It might be a bit dumb on my part, but I am new to PyTorch and am having trouble understanding the network architecture. Is there any visualization of it?

Based on my understanding of the code, I made a Caffe network of the encoder. Can somebody please validate it? I believe the decoder would be something in line with the encoder. The following is a graphical visualization of the encoder network.

[Caffe conv-encoder visualization image omitted]

Ignore the AlexNet part; it is there because I am trying to find a sequence in an image.

The network after pool5 is my encoder. In essence, the pool5 features would be analogous to the word embeddings that you guys use.

Any help would be appreciated :)

Cheers
Harsh Agarwal

confused about the gif on readme

I am reading the paper 'Convolutional Sequence to Sequence Learning' and I am really impressed by the idea of ConvSeq2Seq, which is how I found this project page.
I am confused about the gif in the Introduction part of the README.
Figure 1 in the paper (screenshots omitted here) suggests that the four conv blocks in Layer 1 compute simultaneously, and that the generation part should likewise produce four elements simultaneously. In the gif, however, they are shown one by one, not simultaneously.

preprocessing.py syntax error

Hi, I'm training my own model and ran into the following error:

File "preprocess.py", line 115
print('{} {}'.format(src_dict[k], tgt_dict[v]), file=f)
^
SyntaxError: invalid syntax
(the caret is at the file=f part)

I decided to make sure it wasn't just my prepare-X.sh scripts, by going through the steps listed under Training a New Model, and the same error is produced after entering the following:

python preprocess.py --source-lang de --target-lang en --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test --thresholdtgt 3 --thresholdsrc 3 --destdir data-bin/iwslt14.tokenized.de-en

Is there something I'm missing? As far as I can tell, source and target dictionaries are being properly built, so I don't know exactly what the error is hinting at.
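
For what it's worth, print('...', file=f) is only valid syntax on Python 3 (or on Python 2 with the future import), so a SyntaxError pointing exactly at file=f usually means the script is being run with a Python 2 interpreter. A small sketch that runs under both:

from __future__ import print_function  # needed only on Python 2

with open('example.txt', 'w') as f:
    print('{} {}'.format('word', 3), file=f)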

TypeError: 'NoneType' object is not iterable

Namespace(batch_size=32, beam=5, cpu=False, data='/home/robert_tien/work/pytorch/models/wmt14.en-fr.fconv-py', gen_subset='test', interactive=True, lenpen=1, log_interval=1000, max_len_a=0, max_len_b=200, max_positions=1024, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=False, path=['/home/robert_tien/work/pytorch/models/wmt14.en-fr.fconv-py/model.pt'], quiet=False, remove_bpe=None, seed=1, skip_invalid_size_inputs_valid_test=False, source_lang=None, target_lang=None, unk_replace_dict='', unnormalized=False, workers=1)
Traceback (most recent call last):
File "generate.py", line 161, in
main()
File "generate.py", line 41, in main
dataset = data.load_with_check(args.data, [args.gen_subset], args.source_lang, args.target_lang)
File "/home/xushiting/workspace/pytorch/fairseq-py/fairseq/data.py", line 37, in load_with_check
src, dst = find_language_pair(os.listdir(path))
TypeError: 'NoneType' object is not iterable

RuntimeError: invalid argument 5: k not in range for dimension

Not sure what's going on, but I've got the following result:

epoch 024 | train loss 0.15 | train ppl 1.11 | s/checkpoint 1296 | words/s 7296 | words/batch 599 | bsz 299 | lr 0.000025 | clip 0% | gnorm 0.0389
| epoch 024 | valid on 'valid' subset | valid loss 0.14 | valid ppl 1.10
| done training in 34402.3 seconds
| Test on test with beam=1: BLEU4 = 0.00, 92.3/0.0/0.0/0.0 (BP=1.000, ratio=1.000, syslen=590893, reflen=590893)
| Test on test with beam=5: BLEU4 = 0.00, 92.3/0.0/0.0/0.0 (BP=1.000, ratio=1.000, syslen=590893, reflen=590893)
Traceback (most recent call last):
File "train.py", line 210, in
main()
File "train.py", line 107, in main
cuda_device=(0 if num_gpus > 0 else None))
File "train.py", line 204, in score_test
for _, _, ref, hypos in translator.generate_batched_itr(itr, cuda_device=cuda_device):
File "/home/ada/fairseq-py/fairseq/sequence_generator.py", line 72, in generate_batched_itr
maxlen=(maxlen_a*srclen + maxlen_b))
File "/home/ada/fairseq-py/fairseq/sequence_generator.py", line 86, in generate
return self._generate(src_tokens, src_positions, beam_size, maxlen)
File "/home/ada/fairseq-py/fairseq/sequence_generator.py", line 232, in _generate
probs.view(bsz, -1).topk(cand_size, out=(cand_scores, cand_indices))
RuntimeError: invalid argument 5: k not in range for dimension at /home/ada/pytorch/torch/lib/THC/generic/THCTensorTopK.cu:21
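
The exception itself is the generic topk failure that occurs when k exceeds the size of the dimension being searched, e.g. when cand_size is larger than the flattened beam-times-vocabulary dimension. A minimal repro of just that failure mode (not the fairseq code path):

import torch

scores = torch.randn(2, 3)  # pretend (batch, beam * vocab) scores
scores.topk(5)              # raises RuntimeError because k=5 > dimension size 3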

Checking size attribute of dst when dst is None

In the code below, if dst is None, the dst.sizes[idx] access inside the Exception message will throw an unhandled error.

This is around https://github.com/facebookresearch/fairseq-py/blob/master/fairseq/data.py#L222

for idx in indices:
        # - 2 here stems from make_positions() where we offset positions
        # by padding_value + 1
        if src.sizes[idx] < 2 or \
                (dst is not None and dst.sizes[idx] < 2) or \
                sizes[idx] > max_positions - 2:
            raise Exception("Unable to handle input id {} of "
                            "size {} / {}.".format(idx, src.sizes[idx], dst.sizes[idx]))

To fix this, (dst is not None and dst.sizes[idx] < 2) can be modified to (False if dst is None else dst.sizes[idx] < 2).
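
A sketch of one way to make the error message itself None-safe as well (an assumption on my part that the message, not only the condition, needs guarding):

for idx in indices:
        # - 2 here stems from make_positions() where we offset positions
        # by padding_value + 1
        if src.sizes[idx] < 2 or \
                (False if dst is None else dst.sizes[idx] < 2) or \
                sizes[idx] > max_positions - 2:
            dst_size = dst.sizes[idx] if dst is not None else 'n/a'
            raise Exception("Unable to handle input id {} of "
                            "size {} / {}.".format(idx, src.sizes[idx], dst_size))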
