
opennmt-tf's Introduction


OpenNMT-tf

OpenNMT-tf is a general purpose sequence learning toolkit using TensorFlow 2. While neural machine translation is the main target task, it has been designed to more generally support:

  • sequence to sequence mapping
  • sequence tagging
  • sequence classification
  • language modeling

The project is production-oriented and comes with backward compatibility guarantees.

Key features

Modular model architecture

Models are described with code to allow training custom architectures and overriding default behavior. For example, the following instance defines a sequence to sequence model with 2 concatenated input features, a self-attentional encoder, and an attentional RNN decoder sharing its input and output embeddings:

import opennmt
import tensorflow_addons as tfa  # provides tfa.seq2seq.LuongAttention

opennmt.models.SequenceToSequence(
    source_inputter=opennmt.inputters.ParallelInputter(
        [
            opennmt.inputters.WordEmbedder(embedding_size=256),
            opennmt.inputters.WordEmbedder(embedding_size=256),
        ],
        reducer=opennmt.layers.ConcatReducer(axis=-1),
    ),
    target_inputter=opennmt.inputters.WordEmbedder(embedding_size=512),
    encoder=opennmt.encoders.SelfAttentionEncoder(num_layers=6),
    decoder=opennmt.decoders.AttentionalRNNDecoder(
        num_layers=4,
        num_units=512,
        attention_mechanism_class=tfa.seq2seq.LuongAttention,
    ),
    share_embeddings=opennmt.models.EmbeddingsSharingLevel.TARGET,
)

The opennmt package exposes other building blocks that can be used to design custom model architectures.

Standard models such as the Transformer are defined in a model catalog and can be used without additional configuration.
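
For instance, a catalog model can be selected by name on the command line, following the same pattern as the training command shown later in this document (data.yml is an illustrative configuration file name):

onmt-main --model_type Transformer --config data.yml --auto_config train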

Find more information about model configuration in the documentation.

Full TensorFlow 2 integration

OpenNMT-tf is fully integrated into the TensorFlow 2 ecosystem.

Compatibility with CTranslate2

CTranslate2 is an optimized inference engine for OpenNMT models featuring fast CPU and GPU execution, model quantization, parallel translations, dynamic memory usage, interactive decoding, and more! OpenNMT-tf can automatically export models to be used in CTranslate2.
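
For example, a trained model can be exported in the CTranslate2 format through the export run type; this is a sketch, as the exact option names can differ between OpenNMT-tf versions:

onmt-main --config my_config.yml --auto_config export --output_dir ende_ctranslate2 --format ctranslate2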

Dynamic data pipeline

OpenNMT-tf does not require compiling the data before training. Instead, it reads text files directly and preprocesses the data on the fly, as required by the training. This allows on-the-fly tokenization and data augmentation by injecting random noise, as sketched below.
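
Here is a minimal sketch of noise injection using the word noise building blocks (class names assumed to live under opennmt.data; check the installed version):

import tensorflow as tf

import opennmt

# Compose several random noise operations for data augmentation.
noiser = opennmt.data.WordNoiser(
    noises=[
        opennmt.data.WordDropout(0.1),      # drop each word with probability 0.1
        opennmt.data.WordReplacement(0.1),  # replace each word with probability 0.1
        opennmt.data.WordPermutation(3),    # shuffle words within a window of 3
    ]
)

tokens = tf.constant(["hello", "world", "!"])
noisy_tokens, noisy_length = noiser(tokens)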

Model fine-tuning

OpenNMT-tf supports model fine-tuning workflows:

  • Model weights can be transferred to new word vocabularies, e.g. to inject domain terminology before fine-tuning on in-domain data (see the command sketch after this list)
  • Contrastive learning to reduce word omission errors
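
For example, vocabulary replacement is exposed through the update_vocab run type; a command sketch, where the option names are assumptions that may differ by version:

onmt-main --config my_config.yml --auto_config update_vocab --output_dir updated_checkpoint --src_vocab new-src-vocab.txt --tgt_vocab new-tgt-vocab.txt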

Source-target alignment

Sequence to sequence models can be trained with guided alignment, and alignment information is returned as part of the translation API.


OpenNMT-tf also implements most of the techniques commonly used to train and evaluate sequence models, such as:

  • automatic evaluation during the training
  • multiple decoding strategies: greedy search, beam search, random sampling
  • N-best rescoring
  • gradient accumulation
  • scheduled sampling
  • checkpoint averaging
  • ... and more!

See the documentation to learn how to use these features.

Usage

OpenNMT-tf requires:

  • Python 3.7 or above
  • TensorFlow 2.6, 2.7, 2.8, 2.9, 2.10, 2.11, 2.12, or 2.13

We recommend installing it with pip:

pip install --upgrade pip
pip install OpenNMT-tf

See the documentation for more information.

Command line

OpenNMT-tf comes with several command line utilities to prepare data, train, and evaluate models.

For all tasks involving a model execution, OpenNMT-tf uses a single entry point: onmt-main. A typical OpenNMT-tf run consists of 3 elements:

  • the model type
  • the parameters described in a YAML file
  • the run type such as train, eval, infer, export, score, average_checkpoints, or update_vocab

that are passed to the main script:

onmt-main --model_type <model> --config <config_file.yml> --auto_config <run_type> <run_options>

For more information and examples on how to use OpenNMT-tf, please visit our documentation.

Library

OpenNMT-tf also exposes well-defined and stable APIs, from high-level training utilities to low-level model layers and dataset transformations.

For example, the Runner class can be used to train and evaluate models with just a few lines of code:

import opennmt

config = {
    "model_dir": "/data/wmt-ende/checkpoints/",
    "data": {
        "source_vocabulary": "/data/wmt-ende/joint-vocab.txt",
        "target_vocabulary": "/data/wmt-ende/joint-vocab.txt",
        "train_features_file": "/data/wmt-ende/train.en",
        "train_labels_file": "/data/wmt-ende/train.de",
        "eval_features_file": "/data/wmt-ende/valid.en",
        "eval_labels_file": "/data/wmt-ende/valid.de",
    }
}

model = opennmt.models.TransformerBase()
runner = opennmt.Runner(model, config, auto_config=True)
runner.train(num_devices=2, with_eval=True)
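
Once trained, the same runner can be reused for inference; a minimal sketch, assuming Runner.infer accepts a features file and an output path:

runner.infer("/data/wmt-ende/test.en", predictions_file="/data/wmt-ende/predictions.txt")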

Here is another example using OpenNMT-tf to run efficient beam search with a self-attentional decoder:

import tensorflow as tf
import opennmt

# `memory` and `memory_sequence_length` are assumed to be the outputs of an
# encoder, and `target_embedding` the target-side embedding variable.
decoder = opennmt.decoders.SelfAttentionDecoder(num_layers=6, vocab_size=32000)

initial_state = decoder.initial_state(
    memory=memory, memory_sequence_length=memory_sequence_length
)

batch_size = tf.shape(memory)[0]
start_ids = tf.fill([batch_size], opennmt.START_OF_SENTENCE_ID)

decoding_result = decoder.dynamic_decode(
    target_embedding,
    start_ids=start_ids,
    initial_state=initial_state,
    decoding_strategy=opennmt.utils.BeamSearch(4),
)

More examples using OpenNMT-tf as a library can be found online:

  • The directory examples/library contains additional examples that use OpenNMT-tf as a library
  • nmt-wizard-docker uses the high-level opennmt.Runner API to wrap OpenNMT-tf with a custom interface for training, translating, and serving

For a complete overview of the APIs, see the package documentation.

Additional resources

opennmt-tf's People

Contributors

alexisdoualle, alucardpj, askender, atebbifakhr, awery, cholakov, cservan, dblandan, gcervantes8, guillaumekln, jordimas, jsenellart, leod, mantili, monkeyhippies, nilboy, nslatysheva, ppetrushkov, singhay, steremma, tokestermw, vince62s


opennmt-tf's Issues

Ensure proper dataset shuffling

When eval_delay is shorter than the training time of 1 epoch and sample_buffer_size is smaller than the full dataset size, the training will see at most:

(steps_per_second * eval_delay * batch_size) + sample_buffer_size

training examples, discarding the rest. This can result in overfitting or reduced performance.

Currently a warning is logged when the user did not set sample_buffer_size, but a proper shuffling mechanism should be implemented that is independent of the values of these 2 options. The behavior is illustrated below.
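
The behavior can be reproduced with a plain tf.data pipeline (illustrative only, not OpenNMT-tf code):

import tensorflow as tf

# With a shuffle buffer much smaller than the dataset, an example is only
# drawn once it has entered the buffer. A run that stops early has therefore
# only sampled from a prefix of the data: roughly the examples consumed so
# far plus the current buffer content.
dataset = tf.data.Dataset.range(1_000_000)
dataset = dataset.shuffle(buffer_size=10_000)  # much smaller than the dataset
dataset = dataset.batch(64)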

Meaning of loss during training

Hello, I want to know if the loss printed on the screen while training a seq2seq model is the same as the Perplexity in OpenNMT-lua. I found that the loss fluctuates significantly during training with OpenNMT-tf, while the Perplexity in OpenNMT-lua (the same model with the same settings, including learning rate and optimizer) decreases almost monotonically during training.

I listed a snippet of my training log here:

INFO:tensorflow:loss = 48.18139, step = 1001 (46.707 sec)
INFO:tensorflow:global_step/sec: 2.09447
INFO:tensorflow:words_per_sec/features:0: 3275.1
INFO:tensorflow:words_per_sec/labels:0: 3610.39
INFO:tensorflow:global_step/sec: 2.13073
INFO:tensorflow:words_per_sec/features:0: 2945.64
INFO:tensorflow:words_per_sec/labels:0: 3282.78
INFO:tensorflow:loss = 61.09418, step = 1101 (47.340 sec)
INFO:tensorflow:global_step/sec: 2.08646
INFO:tensorflow:words_per_sec/features:0: 3127.07
INFO:tensorflow:words_per_sec/labels:0: 3464.33
INFO:tensorflow:global_step/sec: 2.14416
INFO:tensorflow:words_per_sec/features:0: 3226.2
INFO:tensorflow:words_per_sec/labels:0: 3564.5
INFO:tensorflow:loss = 110.21961, step = 1201 (47.280 sec)
INFO:tensorflow:global_step/sec: 2.23839
INFO:tensorflow:words_per_sec/features:0: 3106.01
INFO:tensorflow:words_per_sec/labels:0: 3462.01
INFO:tensorflow:global_step/sec: 2.2783
INFO:tensorflow:words_per_sec/features:0: 3460.2
INFO:tensorflow:words_per_sec/labels:0: 3825.8
INFO:tensorflow:loss = 9.316487, step = 1301 (44.303 sec)

Thanks in advance.

"The graph couldn't be sorted in topological order" error when use gpu-build tensorflow serving

When I use tensorflow_model_server without GPU, everything is OK. However, when using tensorflow_model_server built with GPU, it loads the model successfully but errors while processing the request.

tensorflow_model_server log:

2017-12-05 07:56:06.617010: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.66.0
2017-12-05 07:56:06.617028: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 375.66.0
2017-12-05 07:56:06.682749: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:155] Restoring SavedModel bundle.

2017-12-05 07:56:07.306172: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:190] Running LegacyInitOp on SavedModel bundle.
2017-12-05 07:56:07.419016: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:284] Loading SavedModel: success. Took 845041 microseconds.
2017-12-05 07:56:07.419215: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: tech_90 version: 1512156186}
2017-12-05 07:56:07.422146: I tensorflow_serving/model_servers/main.cc:288] Running ModelServer at 0.0.0.0:50145 ...
2017-12-05 07:56:16.370747: E external/org_tensorflow/tensorflow/core/grappler/utils/topological_sort.cc:75] The graph couldn't be sorted in topological order.

client log:

2017-12-05 15:30:10 _server.py[line:381] ERROR Exception calling application: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Incomplete graph, missing 1 inputs for seq2seq/decoder_1/decoder_1/GatherTree")
Traceback (most recent call last):
  File "/Users/xxx/anaconda/envs/py2/lib/python2.7/site-packages/grpc/_server.py", line 376, in _call_behavior
    return behavior(argument, context), True
  File "/Users/xxx/PycharmProjects/text-correction-rpc/src/diagnose_server.py", line 104, in doTextDiagnose
    src_words, tag_words, score = self.doctor.diagnose(check_sent)
  File "/Users/xxx/PycharmProjects/text-correction-rpc/src/core/doctor.py", line 99, in diagnose
    translate_res = translate_res.result()
  File "/Users/xxx/anaconda/envs/py2/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 97, in result
    raise _abortion_error(rpc_error_call)
AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Incomplete graph, missing 1 inputs for seq2seq/decoder_1/decoder_1/GatherTree")

Get internal features from NMT

For some tasks like APE or QE it might be useful to retrieve internal metrics of NMT, given that we already have source and target.

To accomplish this one would need (see the sketch after this list):

  1. Enable "teacher forcing" at inference time
  2. Allow extracting per-word probabilities of the translated text
  3. Allow extracting attention vectors, so that we could have word alignments between source and target
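
Note that point 1 is essentially what the score run type mentioned earlier in this document does (running the decoder with the reference target as input); a command sketch, with option names that are assumptions:

onmt-main --config my_config.yml --auto_config score --features_file src-test.txt --predictions_file tgt-test.txt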

The output layer of the seq2seq model is not shared between training and evaluation? Neither is it shared with the word embedding.

In https://github.com/OpenNMT/OpenNMT-tf/blob/master/opennmt/models/sequence_to_sequence.py, line 156 calls the decode method to compute the next-step logits without passing the output_layer parameter, while at L181 and L194 the dynamic_decode and dynamic_decode_and_search methods are also called without the output_layer parameter. So the output layer is not shared between training and evaluation/testing. Is this a bug or intended? Could someone give some explanation?

Another problem is that build_output_layer calls tf.layers.Dense to project the input into logits, which cannot share its transform parameters with the encoder word embeddings.

Problem in quickstart example

I am following the toy example in the Quickstart section, and receive:

TypeError: __init__() got an unexpected keyword argument 'model_dir'

at main.py line 256 when trying to train the model in step 2.

[Multi-Task] Word Feature Output in Target Side

The OpenNMT Lua version supports extra word features concatenated to each token with the "｜" delimiter, and the decoder can also predict word features.
I'm wondering if the TensorFlow version has this functionality?

Question about BLEU evaluation

This project is extremely useful to me; thanks for your great work.
I use the external evaluators BLEU and BLEU-detok in OpenNMT-tf to evaluate our seq2seq model. The returned result is 2.21/1.96 for BLEU/BLEU-detok. However, when I use another popular BLEU script for evaluation (https://github.com/xinyadu/nqg/blob/master/qgevalcap/bleu/bleu_scorer.py), the returned results are Bleu1: 0.263, Bleu2: 0.10017, Bleu3: 0.04796, Bleu4: 0.026. I cannot figure out why the BLEU in OpenNMT is so much lower than the BLEU from that script.

OpenNMT-tf inference output option

Hello. I wonder if I can output the attention matrix during inference, because I want to replace unk tokens with the source token that receives the highest attention. Or is there another way to deal with unk tokens?

Thank you.

Self-attention decoding is slow

Self-attention decoding is slow by design. However, some tricks were implemented in Tensor2Tensor to make decoding faster. We should learn about those tricks and implement similar optimizations.
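
For reference, the main Tensor2Tensor trick is to cache each layer's keys and values so that every decoding step only computes attention for the new position. A self-contained sketch of the idea (illustrative code, not the OpenNMT-tf API):

import tensorflow as tf

def attend_one_step(query, new_key, new_value, cache):
    # Append the new position to the cache, then attend over all cached
    # positions: per-step cost grows linearly instead of quadratically.
    cache["k"] = tf.concat([cache["k"], new_key], axis=1)    # [batch, t, depth]
    cache["v"] = tf.concat([cache["v"], new_value], axis=1)  # [batch, t, depth]
    scores = tf.matmul(query, cache["k"], transpose_b=True)  # [batch, 1, t]
    weights = tf.nn.softmax(scores)
    return tf.matmul(weights, cache["v"]), cache             # [batch, 1, depth]

batch_size, depth = 2, 8
cache = {
    "k": tf.zeros([batch_size, 0, depth]),
    "v": tf.zeros([batch_size, 0, depth]),
}
for _ in range(3):  # 3 decoding steps
    x = tf.random.normal([batch_size, 1, depth])  # projection of the new position
    context, cache = attend_one_step(x, x, x, cache)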

Transformer baseline-1M results

To keep track of my tests:
Baseline-1M En to Fr.

Test 1: batch of 64 sentences
Multi-bleu on test set: 35.56 after 70K steps // 36.14 after 100K steps.
Same on newstest2014: 27.59 // 28.01

Test 2: batch of 4096 tokens (more than twice the effective size of a 64-sentence batch)
Multi-bleu on test set after 100K steps: 36.88
newstest2014: 27.98

Comparison with T2T:
After 70K steps with 64-sentence batches: 34.21 // after 100K steps: 34.76
After 70K steps with 4096-token batches: 37.44 // newstest2014: 28.85

Serving Error: Load Model

I am trying to load the exported model for serving, but I can't.
For reporting the command here, and for safety reasons, I replaced the exact locations with "something".

tensorflow_model_server --port=9000 --enable_batching=true --batching_parameters_file='something/server/batching_parameters.txt' --model_name=seq2seq --model_base_path='something/server/models/export/latest/'

What is this error about and how can I solve it?

2018-01-24 15:24:17.752914: I tensorflow_serving/model_servers/main.cc:147] Building single TensorFlow model file config: model_name: seq2seq model_base_path: something/server/models/export/latest/
2018-01-24 15:24:17.753401: I tensorflow_serving/model_servers/server_core.cc:441] Adding/updating models.
2018-01-24 15:24:17.753431: I tensorflow_serving/model_servers/server_core.cc:492] (Re-)adding model: seq2seq
2018-01-24 15:24:17.853944: I tensorflow_serving/core/basic_manager.cc:705] Successfully reserved resources to load servable {name: seq2seq version: 1516653311}
2018-01-24 15:24:17.853999: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: seq2seq version: 1516653311}
2018-01-24 15:24:17.854030: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: seq2seq version: 1516653311}
2018-01-24 15:24:17.854070: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:360] Attempting to load native SavedModelBundle in bundle-shim from: something/server/models/export/latest/1516653311
2018-01-24 15:24:17.854097: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:236] Loading SavedModel from: something/server/models/export/latest/1516653311
2018-01-24 15:24:17.956789: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:284] Loading SavedModel: fail. Took 102579 microseconds.
2018-01-24 15:24:17.956847: E tensorflow_serving/util/retrier.cc:38] Loading servable: {name: seq2seq version: 1516653311} failed: Invalid argument: Input 3 of node seq2seq/decoder_1/decoder_1/GatherTree was passed int32 from seq2seq/decoder_1/end_token:0 incompatible with expected INVALID.

loss waves too much

Hi, I'm using the Transformer model to train a Chinese-to-English model; the config is below.
After training for 140K batches, the loss still fluctuates a lot, and its value looks too big compared to other tasks.
Any ideas? Thanks.
[screenshot: training loss curve]

params:
  optimizer: GradientDescentOptimizer
  learning_rate: 0.1
  param_init: 0.1
  clip_gradients: 5.0
  decay_type: exponential_decay
  decay_rate: 0.7
  decay_steps: 5000
  start_decay_steps: 150000
  beam_width: 4
  maximum_iterations: 250
  external_evaluators: BLEU
train:
  batch_size: 100
  save_checkpoints_steps: 5000
  save_summary_steps: 50
  train_steps: 1000000
  eval_delay: 7200 # Every 2 hours.
  maximum_features_length: 50
  maximum_labels_length: 50
  keep_checkpoint_max: 200
  save_eval_predictions: True
infer:
  batch_size: 30

AttributeError: 'str' object has no attribute 'update'

Traceback (most recent call last):
  File "/home/soul/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/soul/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/soul/projects/opennmt-tf/OpenNMT-tf/bin/main.py", line 275, in <module>
    main()
  File "/home/soul/projects/opennmt-tf/OpenNMT-tf/bin/main.py", line 225, in main
    config = load_config(args.config)
  File "opennmt/config.py", line 48, in load_config
    config[section].update(subconfig[section])
AttributeError: 'str' object has no attribute 'update'

The config entry that caused it was "model_dir", whose value is a string.

The config file that I used:

# The directory where models and summaries will be saved. It is created if it does not exist.
model_dir: enfr

data:
  train_features_file: data/enfr/src-train.txt
  train_labels_file: data/enfr/tgt-train.txt
  eval_features_file: data/enfr/src-val.txt
  eval_labels_file: data/enfr/tgt-val.txt

  # (optional) Models may require additional resource files (e.g. vocabularies).
  source_words_vocabulary: data/enfr/src-vocab.txt
  target_words_vocabulary: data/enfr/tgt-vocab.txt

# Model and optimization parameters.
params:
  # The optimizer class name in tf.train or tf.contrib.opt.
  optimizer: AdamOptimizer
  learning_rate: 0.1

  # (optional) Maximum gradients norm (default: None).
  clip_gradients: 5.0
  # (optional) The type of learning rate decay (default: None). See:
  #  * https://www.tensorflow.org/versions/master/api_guides/python/train#Decaying_the_learning_rate
  #  * opennmt/utils/decay.py
  # This value may change the semantics of other decay options. See the documentation or the code.
  decay_type: exponential_decay
  # (optional unless decay_type is set) The learning rate decay rate.
  decay_rate: 0.9
  # (optional unless decay_type is set) Decay every this many steps.
  decay_steps: 10000
  # (optional) If true, the learning rate is decayed in a staircase fashion (default: True).
  staircase: true
  # (optional) After how many steps to start the decay (default: 0).
  start_decay_steps: 50000
  # (optional) Stop decay when this learning rate value is reached (default: 0).
  minimum_learning_rate: 0.0001
  # (optional) Width of the beam search (default: 1).
  beam_width: 5
  # (optional) Length penaly weight to apply on hypotheses (default: 0).
  length_penalty: 0.2
  # (optional) Maximum decoding iterations before stopping (default: 250).
  maximum_iterations: 200

# Training options.
train:
  batch_size: 64

  # (optional) Save a checkpoint every this many steps.
  save_checkpoints_steps: 5000
  # (optional) How many checkpoints to keep on disk.
  keep_checkpoint_max: 3
  # (optional) Save summaries every this many steps.
  save_summary_steps: 100
  # (optional) Train for this many steps. If not set, train forever.
  train_steps: 1000000
  # (optional) Evaluate every this many seconds (default: 3600).
  eval_delay: 7200
  # (optional) Save evaluation predictions in model_dir/eval/.
  save_eval_predictions: false
  # (optional) The maximum length of feature sequences during training (default: None).
  maximum_features_length: 70
  # (optional) The maximum length of label sequences during training (default: None).
  maximum_labels_length: 70
  # (optional) The number of buckets by sequence length to improve training efficiency (default: 5).
  num_buckets: 5
  # (optional) The number of threads to use for processing data in parallel (default: number of logical cores).
  num_parallel_process_calls: 4
  # (optional) The data pre-fetch buffer size, e.g. for shuffling examples (default: batch_size * 1000).
  buffer_size: 10000

# (optional) Inference options.
infer:
  # (optional) The batch size to use (default: 1).
  batch_size: 10
  # (optional) The number of threads to use for processing data in parallel (default: number of logical cores).
  num_parallel_process_calls: 8
  # (optional) The data pre-fetch buffer size when processing data in parallel (default: batch_size * 10).
  buffer_size: 100
  # (optional) For compatible models, the number of hypotheses to output (default: 1).
  n_best: 1

Performance (accuracy) of a Transformer model is somewhat below expectations

I trained several models for En <-> Fr on WMT14 data, and I get around 30 BLEU with these models, while Google's paper says they achieved ~42 BLEU. I'm aware that such a high BLEU depends on batch size / LR / number of GPUs used for training, but my expectation was still to get somewhere around 38 BLEU.

So here is my set-up:

  1. All the En <-> Fr data from WMT14 concatenated together and shuffled; about 40 million sentence pairs.
  2. Learnt BPE of size 32k
  3. Src and Tgt dictionaries are both of size 32k tokens.
  4. All the parameters are from default config
  5. ~315k steps with batch-size of 128 (maximum size that fits into 1080ti RAM)

P.S. When I evaluate the same model on newstest2013, I get ~28 BLEU.

Do you have any insights on what might be improved here, or whether there are some significant differences from the T2T implementation?

Add a proper corpus tokenizer

While working on the big-corpus MT training I noticed that the current implementation lacks a proper tokenization script: the available script only splits on whitespace. So I wrote one for myself based on the Spacy library. I'm not sure what your thoughts are on 3rd-party tools, but if you're ok with it, I can make a pull request.

"""Standalone script to tokenize a corpus based on Spacy NLP library."""

from __future__ import print_function

import argparse
import sys
import spacy

reload(sys)
sys.setdefaultencoding('utf-8')


def main():

  parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
  parser.add_argument(
      "--lang", default="en",
      help="Language of your text.")
  parser.add_argument(
      "--delimiter", default=" ",
      help="Token delimiter for text serialization.")
  args = parser.parse_args()


  nlp = spacy.load(args.lang, disable=['parser', 'tagger', 'ner'])

  lines = []
  for line in sys.stdin:
    line = line.strip().decode("utf-8")

    tokens = nlp(line, disable=['parser', 'tagger', 'ner'])
    merged_tokens = args.delimiter.join([str(token) for token in tokens])
    print(merged_tokens)


if __name__ == "__main__":
  main()

The usage:
python -m bin.tokenize_text_spacy < data/PathTo/giga-fren.release2.fixed.en > data/PathTo/giga-fren.release2.token.en

Performance:
~1.1 GB of text per hour, or ~6-8 million sentences per hour.
The 22.5 million sentences for En<->Fr were processed in ~3 hours for English and ~4.5 hours for French.

FP16 support

This issue will track FP16 support in the project.

Predefined FP16 models are available for testing in config/models/*_fp16.py. Initial tests show poor efficiency and convergence issues with optimizers like Adam. However, this might evolve in the future, so we should maintain a configurable dtype for testing and tinkering.

Experiments and feedback from people using Tesla V100 with the latest CUDA and TensorFlow versions are very welcome.

latest code error: Tensor conversion requested dtype float32 for Tensor with dtype int32

When I use the latest commit 0f43259 to train my model, I get these error messages:

2018-02-08 16:06:31.254817: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
2018-02-08 16:06:32.060998: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:895] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-02-08 16:06:32.061448: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 980 Ti major: 5 minor: 2 memoryClockRate(GHz): 1.114
pciBusID: 0000:04:00.0
totalMemory: 5.93GiB freeMemory: 5.83GiB
2018-02-08 16:06:32.061477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:04:00.0, compute capability: 5.2)
INFO:tensorflow:Using config: {'_task_type': 'worker', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fa23ba12208>, '_tf_random_seed': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_save_summary_steps': 50, '_save_checkpoints_steps': 1000, '_num_worker_replicas': 1, '_is_chief': True, '_save_checkpoints_secs': None, '_session_config': gpu_options {
}
allow_soft_placement: true
, '_task_id': 0, '_num_ps_replicas': 0, '_service': None, '_log_step_count_steps': 50, '_model_dir': 'models/seq-tagging-020816', '_master': ''}
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 300 secs (eval_spec.throttle_secs) or training is finished.
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/ssdDisk/code/opennmt-tf-0110/bin/main.py", line 335, in <module>
    main()
  File "/ssdDisk/code/opennmt-tf-0110/bin/main.py", line 317, in main
    train(estimator, model, config, num_devices=args.num_gpus)
  File "/ssdDisk/code/opennmt-tf-0110/bin/main.py", line 153, in train
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/estimator/training.py", line 432, in train_and_evaluate
    executor.run_local()
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/estimator/training.py", line 611, in run_local
    hooks=train_hooks)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 314, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 743, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 725, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/ssdDisk/code/opennmt-tf-0110/opennmt/models/model.py", line 78, in _model_fn
    loss = _extract_loss(losses_shards)
  File "/ssdDisk/code/opennmt-tf-0110/opennmt/models/model.py", line 61, in _extract_loss
    actual_loss = _normalize_loss(loss[0], den=loss[1])
  File "/ssdDisk/code/opennmt-tf-0110/opennmt/models/model.py", line 47, in _normalize_loss
    return tf.add_n(num) / tf.add_n(den)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 898, in binary_op_wrapper
    y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 932, in convert_to_tensor
    as_ref=False)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1022, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/workHome/venv/py3-tf15/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 866, in _TensorTensorConversionFunction
    (dtype.name, t.dtype.name, str(t)))
ValueError: Tensor conversion requested dtype float32 for Tensor with dtype int32: 'Tensor("seqtagger/parallel_0/seqtagger/strided_slice:0", shape=(), dtype=int32, device=/device:GPU:0)'

But if I reset to commit cb66148, everything is OK.
Did something break between these two commits?

In-graph replication for RNNs

Just a reminder to keep track

#59 works for the Transformer but not for RNNs.

But distributed training does work (it is just less convenient to set up).

Add BLEU evaluation metric

Ideally, this metric should be compatible with tf.metrics for seamless integration in the training flow. Otherwise, we could rely on the opennmt.utils.hook.SaveEvaluationPredictionHook hook for external evaluation.

Installation via pip

A direct pip install of the package,

pip install git+git://github.com/OpenNMT/[email protected]

gives an error:

Python 3.6.4 (default, Mar  8 2018, 13:06:21)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import opennmt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/opennmt/__init__.py", line 3, in <module>
    from opennmt import decoders
ImportError: cannot import name 'decoders'

I think the issue is in setup.py: the packages setting is set to ["opennmt"] instead of letting find_packages() find the Python modules. I also noticed that find_packages() was removed in this commit:

70ffde8#diff-2eeaed663bd0d25b7e608891384b7298

Is there any reason for this?

Thanks

error when serving seq_tagger model

I trained a seq_tagger model and served it with TensorFlow Serving, but when I test the served model, I get the error message:

grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Placeholder_3:0 is both fed and fetched.")

the model detailed info is below:

The given SavedModel SignatureDef contains the following input(s):
  inputs['chars'] tensor_info:
      dtype: DT_STRING
      shape: (-1, -1, -1)
      name: Placeholder_2:0
  inputs['length'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: Placeholder_3:0
  inputs['tokens'] tensor_info:
      dtype: DT_STRING
      shape: (-1, -1)
      name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['length'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: Placeholder_3:0
  outputs['tags'] tensor_info:
      dtype: DT_STRING
      shape: (-1, -1)
      name: seqtagger/index_to_string_Lookup:0
Method name is: tensorflow/serving/predict

It shows that inputs['length'] and outputs['length'] are the same placeholder, which is why it raises an error.

How to use seq_tagger.py for sequence labeling

This project is the best TensorFlow seq2seq implementation I have ever seen.
Could you provide any running examples for seq_tagger.py? E.g., its input and output format, how to evaluate it, and so on. Sequence labeling is quite different from seq2seq tasks.
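
In case it helps while waiting for an official example, here is the expected data format as a guess based on how the other models in this project read data (an assumption, not verified against seq_tagger.py): the features file contains one tokenized sentence per line, and the labels file contains the same number of lines with one tag per token, space-separated and aligned one-to-one:

# train.tokens
John lives in New York

# train.tags
B-PER O O B-LOC I-LOC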

FP16: ValueError: Tensor conversion requested dtype float32 for Tensor with dtype float16:

Hi,
when trying to run an FP16 model on a Titan V with TensorFlow 1.6rc1, I get the following (on both master and the tf-1.6 branch):
CUDA_VISIBLE_DEVICES=1 python -m bin.main train --model config/models/nmt_ari_medium_fp16.py --config /data/a1/config/ari-defaults.yml /data/a1/config/cfg-anon16.yml

/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
2018-03-09 07:05:16.935002: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: Graphics Device major: 7 minor: 0 memoryClockRate(GHz): 1.455
pciBusID: 0000:04:00.0
totalMemory: 11.78GiB freeMemory: 8.53GiB
2018-03-09 07:05:16.935032: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-03-09 07:05:17.738979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-03-09 07:05:17.739003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-03-09 07:05:17.739009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-03-09 07:05:17.739214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/device:GPU:0 with 8225 MB memory) -> physical GPU (device: 0, name: Graphics Device, pci bus id: 0000:04:00.0, compute capability: 7.0)
INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_session_config': gpu_options {
}
allow_soft_placement: true
, '_keep_checkpoint_max': 5, '_task_type': 'worker', '_global_id_in_cluster': 0, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f62269c2950>, '_evaluation_master': '', '_save_checkpoints_steps': 5000, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_num_ps_replicas': 0, '_tf_random_seed': None, '_master': '', '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 50, '_model_dir': '/data/a1/data_anon16', '_save_summary_steps': 50}
INFO:tensorflow:Calling model_fn.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ari/fp16/OpenNMT-tf/bin/main.py", line 135, in <module>
    main()
  File "/home/ari/fp16/OpenNMT-tf/bin/main.py", line 118, in main
    runner.train()
  File "opennmt/runner.py", line 140, in train
    train_spec.input_fn, hooks=train_spec.hooks, max_steps=train_spec.max_steps)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 355, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 824, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 805, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "opennmt/models/model.py", line 78, in _model_fn
    _loss_op, features_shards, labels_shards, params, mode, config)
  File "opennmt/utils/parallel.py", line 148, in __call__
    outputs.append(funs[i](*args[i], **kwargs[i]))
  File "opennmt/models/model.py", line 41, in _loss_op
    logits, _ = self._build(features, labels, params, mode, config)
  File "opennmt/models/sequence_to_sequence.py", line 178, in _build
    memory_sequence_length=encoder_sequence_length)
  File "opennmt/decoders/rnn_decoder.py", line 111, in decode
    dtype=inputs.dtype)
  File "opennmt/decoders/rnn_decoder.py", line 312, in _build_cell
    memory_sequence_length=memory_sequence_length)
  File "opennmt/decoders/rnn_decoder.py", line 248, in _build_attention_mechanism
    num_units, memory, memory_sequence_length=memory_sequence_length)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/seq2seq/python/ops/attention_wrapper.py", line 408, in __init__
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/seq2seq/python/ops/attention_wrapper.py", line 215, in __init__
    self.memory_layer(self._values) if self.memory_layer  # pylint: disable=not-callable
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/base.py", line 709, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/core.py", line 151, in call
    inputs = ops.convert_to_tensor(inputs, dtype=self.dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 950, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1040, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 883, in _TensorTensorConversionFunction
    (dtype.name, t.dtype.name, str(t)))
ValueError: Tensor conversion requested dtype float32 for Tensor with dtype float16: 'Tensor("seq2seq/parallel_0/seq2seq/decoder/LuongAttention/mul:0", shape=(?, ?, 1024), dtype=float16, device=/device:GPU:0)'

File reading unicode error

When trying the quickstart example, I faced an error regarding file opening in
utils\misc.py
It got resolved once I changed

line 40: with open(filename) as f:
to
line 40:  with open(filename, encoding="utf8") as f:

I'll open a pull request with the fix.
Windows, py3.6, tf1.4
python -m bin.main train --model config/models/nmt_small.py --config config/opennmt-defaults.yml config/data/toy-ende.yml

INFO:tensorflow:Using config: {'_model_dir': 'toy-ende', '_tf_random_seed': None, '_save_summary_steps': 50, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': gpu_options {
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 50, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000002213F038F60>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 18000 secs (eval_spec.throttle_secs) or training is finished.
Traceback (most recent call last):
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Ayush\Projects\OpenNMT-tf\bin\main.py", line 308, in <module>
    main()
  File "C:\Users\Ayush\Projects\OpenNMT-tf\bin\main.py", line 290, in main
    train(estimator, model, config)
  File "C:\Users\Ayush\Projects\OpenNMT-tf\bin\main.py", line 135, in train
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\training.py", line 430, in train_and_evaluate
    executor.run_local()
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\training.py", line 609, in run_local
    hooks=train_hooks)
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 302, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 708, in _train_model
    input_fn, model_fn_lib.ModeKeys.TRAIN)
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 577, in _get_features_and_labels_from_input_fn
    result = self._call_input_fn(input_fn, mode)
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 663, in _call_input_fn
    return input_fn(**kwargs)
  File "C:\Users\Ayush\Projects\OpenNMT-tf\opennmt\models\model.py", line 515, in <lambda>
    maximum_labels_length=maximum_labels_length)
  File "C:\Users\Ayush\Projects\OpenNMT-tf\opennmt\models\model.py", line 374, in _input_fn_impl
    self._initialize(metadata)
  File "C:\Users\Ayush\Projects\OpenNMT-tf\opennmt\models\sequence_to_sequence.py", line 93, in _initialize
    self.source_inputter.initialize(metadata)
  File "C:\Users\Ayush\Projects\OpenNMT-tf\opennmt\inputters\text_inputter.py", line 304, in initialize
    self.vocabulary_size = count_lines(self.vocabulary_file) + self.num_oov_buckets
  File "C:\Users\Ayush\Projects\OpenNMT-tf\opennmt\utils\misc.py", line 42, in count_lines
    for i, _ in enumerate(f):
  File "C:\Users\Ayush\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 5597: character maps to <undefined>

Error when running python -m bin.main train --model config/models/nmt_small.py --config config/opennmt-defaults.yml config/data/toy-ende.yml

FYI, I am using Windows 10, and the (virtually) full output is as follows:
(tensorflow) E:\repos\OpenNMT-tf>python -m bin.main train --model config/models/nmt_small.py --config config/opennmt-defaults.yml config/data/toy-ende.yml
INFO:tensorflow:Using config: {'_model_dir': 'toy-ende', '_tf_random_seed': None, '_save_summary_steps': 50, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': gpu_options {
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 50, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001DE786BA0F0>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 18000 secs (eval_spec.throttle_secs) or training is finished.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Number of trainable parameters: 59222508
2018-01-29 01:27:22.032908: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-01-29 01:27:22.689337: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1105] Found device 0 with properties:
name: Quadro M600M major: 5 minor: 0 memoryClockRate(GHz): 0.8755
pciBusID: 0000:01:00.0
totalMemory: 2.00GiB freeMemory: 1.66GiB
2018-01-29 01:27:22.689509: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro M600M, pci bus id: 0000:01:00.0, compute capability: 5.0)
INFO:tensorflow:Saving checkpoints for 1 into toy-ende\model.ckpt.
INFO:tensorflow:loss = 10.4862585, step = 1
2018-01-29 01:27:49.416435: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 323.57MiB. Current allocation summary follows.
2018-01-29 01:27:49.416958: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (256): Total Chunks: 226, Chunks in use: 223. 56.5KiB allocated for chunks. 55.8KiB in use in bin. 9.4KiB client-requested in use in bin.
2018-01-29 01:27:49.418437: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (512): Total Chunks: 1, Chunks in use: 0. 768B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-01-29 01:27:49.418702: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (1024): Total Chunks: 1, Chunks in use: 1. 1.3KiB allocated for chunks. 1.3KiB in use in bin. 1.0KiB client-requested in use in bin.
2018-01-29 01:27:49.418952: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (2048): Total Chunks: 33, Chunks in use: 32. 79.8KiB allocated for chunks. 76.5KiB in use in bin. 58.0KiB client-requested in use in bin.
2018-01-29 01:27:49.419195: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (4096): Total Chunks: 111, Chunks in use: 111. 804.8KiB allocated for chunks. 804.8KiB in use in bin. 804.8KiB client-requested in use in bin.
2018-01-29 01:27:49.419397: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (8192): Total Chunks: 13, Chunks in use: 13. 115.5KiB allocated for chunks. 115.5KiB in use in bin. 111.0KiB client-requested in use in bin.
2018-01-29 01:27:49.419628: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (16384): Total Chunks: 2, Chunks in use: 2. 37.0KiB allocated for chunks. 37.0KiB in use in bin. 37.0KiB client-requested in use in bin.
2018-01-29 01:27:49.419848: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (32768): Total Chunks: 1, Chunks in use: 0. 48.3KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-01-29 01:27:49.420079: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-01-29 01:27:49.421086: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (131072): Total Chunks: 1289, Chunks in use: 1231. 161.41MiB allocated for chunks. 154.16MiB in use in bin. 153.92MiB client-requested in use in bin.
2018-01-29 01:27:49.421360: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (262144): Total Chunks: 268, Chunks in use: 137. 71.63MiB allocated for chunks. 38.88MiB in use in bin. 38.88MiB client-requested in use in bin.
2018-01-29 01:27:49.421600: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-01-29 01:27:49.423608: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (1048576): Total Chunks: 1, Chunks in use: 1. 1.00MiB allocated for chunks. 1.00MiB in use in bin. 1.00MiB client-requested in use in bin.
2018-01-29 01:27:49.424369: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (2097152): Total Chunks: 4, Chunks in use: 4. 14.63MiB allocated for chunks. 14.63MiB in use in bin. 14.50MiB client-requested in use in bin.
2018-01-29 01:27:49.424834: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (4194304): Total Chunks: 1, Chunks in use: 1. 6.64MiB allocated for chunks. 6.64MiB in use in bin. 4.63MiB client-requested in use in bin.
2018-01-29 01:27:49.425584: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (8388608): Total Chunks: 41, Chunks in use: 7. 361.33MiB allocated for chunks. 64.00MiB in use in bin. 64.00MiB client-requested in use in bin.
2018-01-29 01:27:49.426064: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (16777216): Total Chunks: 2, Chunks in use: 1. 47.06MiB allocated for chunks. 20.83MiB in use in bin. 12.00MiB client-requested in use in bin.
2018-01-29 01:27:49.426584: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (33554432): Total Chunks: 1, Chunks in use: 1. 48.83MiB allocated for chunks. 48.83MiB in use in bin. 48.83MiB client-requested in use in bin.
2018-01-29 01:27:49.427393: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (67108864): Total Chunks: 4, Chunks in use: 4. 292.84MiB allocated for chunks. 292.84MiB in use in bin. 279.84MiB client-requested in use in bin.
2018-01-29 01:27:49.437010: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-01-29 01:27:49.446938: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:628] Bin (268435456): Total Chunks: 1, Chunks in use: 1. 424.90MiB allocated for chunks. 424.90MiB in use in bin. 323.57MiB client-requested in use in bin.
2018-01-29 01:27:49.449867: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:644] Bin for 323.57MiB was 256.00MiB, Chunk State:
2018-01-29 01:27:49.454072: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:662] Chunk at 0000000C010E0000 of size 1280
2018-01-29 01:27:49.466380: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:662] Chunk at 0000000C010E0500 of size 256
2018-01-29 01:27:49.474598: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:662] Chunk at 0000000C010E0600 of size 256
2018-01-29 01:27:49.486914: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:662] Chunk at 0000000C010E0700 of size 256
2018-01-29 01:27:49.489387: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:662] Chunk at 0000000C010E0800 of size 256
...
2018-01-29 01:28:05.238525: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 21837824 totalling 20.83MiB
2018-01-29 01:28:05.253985: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 51197952 totalling 48.83MiB
2018-01-29 01:28:05.257047: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 3 Chunks of size 73359360 totalling 209.88MiB
2018-01-29 01:28:05.272085: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 86990848 totalling 82.96MiB
2018-01-29 01:28:05.275387: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:680] 1 Chunks of size 445540864 totalling 424.90MiB
2018-01-29 01:28:05.288695: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:684] Sum Total of in-use chunks: 1.04GiB
2018-01-29 01:28:05.306023: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:686] Stats:
Limit: 1500921856
InUse: 1119637504
MaxInUse: 1500920832
NumAllocs: 7767
MaxAllocSize: 445540864

2018-01-29 01:28:05.323922: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:277] ***************************************_*****************************************************xxxxxxx
2018-01-29 01:28:05.327611: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1198] Resource exhausted: OOM when allocating tensor with shape[64,37,35820] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_call
    return fn(*args)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1329, in _run_fn
    status, run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64,37,35820] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: seq2seq/decoder/decoder_1/transpose = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](seq2seq/decoder/decoder_1/TensorArrayStack/TensorArrayGatherV3, seq2seq/OptimizeLoss/gradients/seq2seq/encoder/rnn/transpose_grad/InvertPermutation)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: seq2seq/OptimizeLoss/gradients/seq2seq/decoder/decoder_1/TrainingHelperInitialize/cond/Merge_grad/tuple/control_dependency/_645 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_916_seq2seq/OptimizeLoss/gradients/seq2seq/decoder/decoder_1/TrainingHelperInitialize/cond/Merge_grad/tuple/control_dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "E:\repos\OpenNMT-tf\bin\main.py", line 275, in <module>
    main()
  File "E:\repos\OpenNMT-tf\bin\main.py", line 263, in main
    train(estimator, model, config)
  File "E:\repos\OpenNMT-tf\bin\main.py", line 125, in train
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\training.py", line 432, in train_and_evaluate
    executor.run_local()
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\training.py", line 611, in run_local
    hooks=train_hooks)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 314, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 815, in _train_model
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py", line 539, in run
    run_metadata=run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1013, in run
    run_metadata=run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1104, in run
    raise six.reraise(*original_exc_info)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1089, in run
    return self._sess.run(*args, **kwargs)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1161, in run
    run_metadata=run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py", line 941, in run
    return self._sess.run(*args, **kwargs)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 895, in run
    run_metadata_ptr)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1344, in _do_run
    options, run_metadata)
  File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64,37,35820] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: seq2seq/decoder/decoder_1/transpose = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](seq2seq/decoder/decoder_1/TensorArrayStack/TensorArrayGatherV3, seq2seq/OptimizeLoss/gradients/seq2seq/encoder/rnn/transpose_grad/InvertPermutation)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: seq2seq/OptimizeLoss/gradients/seq2seq/decoder/decoder_1/TrainingHelperInitialize/cond/Merge_grad/tuple/control_dependency/_645 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_916_seq2seq/OptimizeLoss/gradients/seq2seq/decoder/decoder_1/TrainingHelperInitialize/cond/Merge_grad/tuple/control_dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'seq2seq/decoder/decoder_1/transpose', defined at:
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "E:\repos\OpenNMT-tf\bin\main.py", line 275, in
main()
File "E:\repos\OpenNMT-tf\bin\main.py", line 263, in main
train(estimator, model, config)
File "E:\repos\OpenNMT-tf\bin\main.py", line 125, in train
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\training.py", line 432, in train_and_evaluate
executor.run_local()
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\training.py", line 611, in run_local
hooks=train_hooks)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 314, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 743, in _train_model
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\estimator\estimator.py", line 725, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "E:\repos\OpenNMT-tf\opennmt\models\model.py", line 104, in call
outputs, predictions = self._build(features, labels, params, mode, config)
File "E:\repos\OpenNMT-tf\opennmt\models\sequence_to_sequence.py", line 164, in _build
memory_sequence_length=encoder_sequence_length)
File "E:\repos\OpenNMT-tf\opennmt\decoders\rnn_decoder.py", line 119, in decode
outputs, state, length = tf.contrib.seq2seq.dynamic_decode(decoder)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\seq2seq\python\ops\decoder.py", line 324, in dynamic_decode
final_outputs = nest.map_structure(_transpose_batch_time, final_outputs)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\nest.py", line 387, in map_structure
structure[0], [func(*x) for x in entries])
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\util\nest.py", line 387, in
structure[0], [func(*x) for x in entries])
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\rnn.py", line 73, in _transpose_batch_time
([1, 0], math_ops.range(2, x_rank)), axis=0))
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1392, in transpose
ret = transpose_fn(a, perm, name=name)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 7687, in transpose
"Transpose", x=x, perm=perm, name=name)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 3160, in create_op
op_def=op_def)
File "C:\Program Files\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[64,37,35820] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: seq2seq/decoder/decoder_1/transpose = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](seq2seq/decoder/decoder_1/TensorArrayStack/TensorArrayGatherV3, seq2seq/OptimizeLoss/gradients/seq2seq/encoder/rnn/transpose_grad/InvertPermutation)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: seq2seq/OptimizeLoss/gradients/seq2seq/decoder/decoder_1/TrainingHelperInitialize/cond/Merge_grad/tuple/control_dependency/_645 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_916_seq2seq/OptimizeLoss/gradients/seq2seq/decoder/decoder_1/TrainingHelperInitialize/cond/Merge_grad/tuple/control_dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
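
For reference, the hint above refers to the TF 1.x RunOptions API. A minimal sketch of how it can be enabled (the tensor shape is taken from the log above; the rest is illustrative, since the training script does not expose this option directly):

import tensorflow as tf

# TF 1.x sketch: ask the runtime to dump the list of allocated tensors
# when an OOM occurs, as suggested by the hint above.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

x = tf.random_normal([64, 37, 35820])  # shape of the failing allocation
with tf.Session() as sess:
    sess.run(x, options=run_options)

In practice, the usual fix for this kind of OOM is reducing the train batch_size or the maximum sequence lengths in the YAML configuration.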

beam_width: 1 triggers an error

1080, pci bus id: 0000:02:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from baseline-1M-enfr/model.ckpt-200000
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/moses/tensorflowwork/OpenNMT-tf/bin/main.py", line 299, in
main()
File "/home/moses/tensorflowwork/OpenNMT-tf/bin/main.py", line 293, in main
predictions_file=args.predictions_file)
File "/home/moses/tensorflowwork/OpenNMT-tf/bin/main.py", line 163, in infer
model.print_prediction(prediction, params=config["infer"], stream=stream)
File "opennmt/models/sequence_to_sequence.py", line 230, in print_prediction
tokens = prediction["tokens"][i][:prediction["length"][i] - 1] # Ignore </s>.
IndexError: invalid index to scalar variable.
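
This message is NumPy's error for indexing a scalar, which suggests that with beam_width: 1 the prediction's "length" entry comes back as a 0-d value instead of a 1-D array per hypothesis. A minimal reproduction of the same error outside OpenNMT-tf (illustrative only):

import numpy as np

length = np.int32(12)  # a NumPy scalar rather than a 1-D array
length[0]              # IndexError: invalid index to scalar variable.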

REST Server?

Hello ONMT team,

Does the TensorFlow version of OpenNMT come with a ready-to-use REST server?

Thanks,
mzeid

Segmentation fault after building vocabularies with the default options

Just to save some time for others.

The demo model worked fine for me, so I decided to try the Transformer model on a bigger dataset, namely the En<->Fr Giga corpus with ~20M sentences.

My mistake was that I created the vocabulary as shown in the tutorial, without the "--size" parameter. As a result, the script collected every word from the 20M sentences, and training could not handle a vocabulary of that size: it crashed with a segmentation fault.

After I created a vocabulary of size 50,000, everything worked fine.
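
For reference, a sketch of the vocabulary command with a capped size (paths are illustrative, and flag names may vary across versions):

# Keep only the 50,000 most frequent tokens.
python -m bin.build_vocab --size 50000 --save_vocab data/en.vocab data/train.en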

Baseline results are far off from Lua version

@guillaumekln I redid my comparison; the results are not as good.

Baseline 1M enfr:
Lua (brnn 2x500): BLEU on test set without replace_unk = 35.65 (beam1) 37.00 (beam5)
62 min per epoch on a GTX1080

TF (2x500 rnn) optim sgd: BLEU 26.47 after 200k steps
1h24 per 15K steps (approx 1 epoch) on a GTX1080ti

TF (2x500 rnn) optim noam: BLEU 28.87 after 200k steps

TF (transformer) optim noam: BLEU 33.23 after 100K steps

Vocabulary builder is not compatible with Lua tokenizer

When running the Lua tokenizer in BPE mode (from OpenNMT), the default delimiter is a non-printable character.

When I later try to build a vocabulary from this tokenized text, I get:

File "opennmt/utils/vocab.py", line 43, in add_from_text
    line = line.decode("utf-8").strip()
  File "/home/soul/anaconda2/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-2: invalid continuation byte
Changing the line to:

line = line.decode("utf-8", "replace").strip()

solves the issue.

Empty line error caused by '\r' in the Transformer tutorial

I was trying the scripts under scripts/wmt. The preprocessing was successful, but when I ran the training, this error appeared:
InvalidArgumentError (see above for traceback): Invalid content in data/wmtende.vocab: empty line found at position 283470.

The environment I used is:
python==3.5.2;
tensorflow-gpu==1.7.0;
OpenNMT-tf==1.1.0;
sentencepiece==latest version from GitHub
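
If anyone hits the same error, one way to clean the vocabulary file before training is stripping the trailing carriage returns (assuming GNU sed; the path matches the error message above):

# Remove Windows-style '\r' line endings that produce "empty" lines.
sed -i 's/\r$//' data/wmtende.vocab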

Data loss: corrupted record at xx...

I have randomly encountered "Data loss: corrupted record at xx..." with the LAS model as follows. I tried recreating the record file, single/multiple GPUs, and single/multiple num_parallel_process_calls, but nothing works. Here is the data I used for training. Could you let me know if my configuration is incorrect?

[kwon@ssi-dnn-slave-002 data]$ wc -l *
      38571 dev.records
         33 dev.txt
   41002559 train.records
      37416 train.txt
         34 vocab.txt
   41078613 total
[kwon@ssi-dnn-slave-002 data]$ ls -hal
total 14G
drwxr-xr-x 2 kwon domain users   95 Nov  8 15:00 .
drwxr-xr-x 5 kwon domain users  205 Nov  9 14:48 ..
-rw-r--r-- 1 kwon domain users  14M Nov  8 15:00 dev.records
-rw-r--r-- 1 kwon domain users 5.9K Nov  8 15:00 dev.txt
-rw-r--r-- 1 kwon domain users  14G Nov  8 15:00 train.records
-rw-r--r-- 1 kwon domain users 7.3M Nov  8 15:00 train.txt
-rw-r--r-- 1 kwon domain users   81 Nov  8 15:00 vocab.txt

Here is the command I used for training:

CUDA_VISIBLE_DEVICES=0 python -m bin.main train \
  --model config/models/las_wsj.py \
  --config config/las_wsj.yml

Here is the YAML configuration:

model_dir: /home/kwon/Project/wsj_kaldi_tf/exp

data:
  train_features_file: /home/kwon/Project/wsj_kaldi_tf/data/train.records
  train_labels_file: /home/kwon/Project/wsj_kaldi_tf/data/train.txt
  eval_features_file: /home/kwon/Project/wsj_kaldi_tf/data/dev.records
  eval_labels_file: /home/kwon/Project/wsj_kaldi_tf/data/dev.txt
  input_depth: 120
  target_words_vocabulary: /home/kwon/Project/wsj_kaldi_tf/data/vocab.txt

params:
  optimizer: GradientDescentOptimizer
  learning_rate: 0.2
  clip_gradients: 5.0
  decay_type: exponential_decay
  decay_rate: 1.0
  decay_steps: 10000
  staircase: true
  start_decay_steps: 50000
  scheduled_sampling_type: inverse_sigmoid
  scheduled_sampling_read_probability: 1
  scheduled_sampling_k: 7.5
  beam_width: 5
  maximum_iterations: 200

train:
  batch_size: 20
  save_checkpoints_steps: 40000
  keep_checkpoint_max: 300
  save_summary_steps: 40000
  train_steps: 1122480
  eval_delay: 7200
  save_eval_predictions: false
  maximum_features_length: 2435
  maximum_labels_length: 245
  num_buckets: 5
  num_parallel_process_calls: 1
  buffer_size: 32726
  input_depth: 120

infer:
  batch_size: 3
  num_parallel_process_calls: 1
  buffer_size: 100
  n_best: 1

Here is the model definition I used for LAS:

import tensorflow as tf
import opennmt as onmt

def model():
  return onmt.models.SequenceToSequence(
    # Source side: 120-dim acoustic feature vectors read from TFRecords.
    source_inputter=onmt.inputters.SequenceRecordInputter(
      input_depth_key="input_depth"),
    # Target side: embeddings of the transcription tokens.
    target_inputter=onmt.inputters.WordEmbedder(
      vocabulary_file_key="target_words_vocabulary",
      embedding_size=20),
    # Listener: pyramidal encoder halving the time resolution at each layer.
    encoder=onmt.encoders.PyramidalRNNEncoder(
      num_layers=3,
      num_units=256,
      reduction_factor=2,
      cell_class=tf.contrib.rnn.LSTMCell,
      dropout=0),
    # Speller: RNN decoder attending over the encoder states.
    decoder=onmt.decoders.MultiAttentionalRNNDecoder(
      num_layers=3,
      num_units=256,
      attention_layers=[0],
      attention_mechanism_class=tf.contrib.seq2seq.LuongAttention,
      cell_class=tf.contrib.rnn.LSTMCell,
      dropout=0,
      residual_connections=False))

Logs

2017-11-09 13:56:04.155749: I tensorflow/core/kernels/shuffle_dataset_op.cc:110] Filling up shuffle buffer (this may take a while): 15594 of 32726
2017-11-09 13:56:14.154468: I tensorflow/core/kernels/shuffle_dataset_op.cc:110] Filling up shuffle buffer (this may take a while): 20877 of 32726
2017-11-09 13:56:23.832362: W tensorflow/core/framework/op_kernel.cc:1192] Data loss: corrupted record at 9945442474
Traceback (most recent call last):
2017-11-08 11:00:07.539005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, nam
e: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
2017-11-08 11:00:14.879136: W tensorflow/core/framework/op_kernel.cc:1192] Data loss: corrupted record at 4135444529
Traceback (most recent call last):
INFO:tensorflow:loss = 2.90062, step = 3939 (249.989 sec)
INFO:tensorflow:loss = 2.91381, step = 4039 (252.683 sec)
INFO:tensorflow:loss = 2.90139, step = 4139 (248.092 sec)
INFO:tensorflow:loss = 2.90898, step = 4239 (250.705 sec)
2017-11-09 01:20:41.776980: I tensorflow/core/kernels/shuffle_dataset_op.cc:110] Filling up shuffle buffer (this may take a while): 29790 of 40000
2017-11-09 01:20:44.049085: W tensorflow/core/framework/op_kernel.cc:1192] Data loss: corrupted record at 13966444711
INFO:tensorflow:loss = 2.97916, step = 1002 (204.074 sec)
INFO:tensorflow:loss = 2.94589, step = 1102 (208.752 sec)
INFO:tensorflow:loss = 2.93662, step = 1202 (214.285 sec)
2017-11-07 23:27:42.923039: W tensorflow/core/framework/op_kernel.cc:1192] Data loss: corrupted record at 16288775880
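
A minimal TF 1.x sketch to verify a TFRecord file and locate the first corrupted record (the path is illustrative):

import tensorflow as tf

path = "train.records"
count = 0
try:
    # tf_record_iterator raises DataLossError on a corrupted record.
    for _ in tf.python_io.tf_record_iterator(path):
        count += 1
    print("OK: %d records" % count)
except tf.errors.DataLossError as e:
    print("Corruption after %d records: %s" % (count, e))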

gpu_allow_growth is not working

Calling the tutorial code with the additional argument --gpu_allow_growth

python -m bin.main train_and_eval --gpu_allow_growth --model config/models/nmt_small.py --config config/opennmt-defaults.yml config/data/toy-ende.yml

does not prevent the GPU memory from being fully allocated. After some digging, I found that a call in https://github.com/OpenNMT/OpenNMT-tf/blob/master/opennmt/utils/parallel.py#L27 is causing this.

It seems this issue has already been brought up in a Stack Overflow question.
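
For context, this is what the flag is expected to configure in TF 1.x; the report above suggests a session is created elsewhere with a default config, which would still map the whole GPU (sketch only):

import tensorflow as tf

# Let TensorFlow grow GPU memory usage on demand instead of
# reserving it all upfront.
session_config = tf.ConfigProto()
session_config.gpu_options.allow_growth = True

# The option only helps if it reaches every tf.Session the process
# creates, e.g. through the Estimator's RunConfig.
run_config = tf.estimator.RunConfig(session_config=session_config)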

Transformer model produces "." as output

Hello,
I've been trying to train a Transformer model using the latest code version.
I tried using TensorFlow 1.6 (I had to convert the code to Python 3 for that) as well as TensorFlow 1.5.
However, I get only "." as output when testing the model on the toy-ende dataset provided in the distribution. I use transformer.py and transformer1gpu.yml as the config files.
On the other hand, training a vanilla RNN model (with nmt_small.py and opennmt-defaults.yml) works fine and produces legitimate outputs. Any lead on this?

Shared vocabulary

Hi,
I'm not sure whether I missed it: is a shared (source & target) vocabulary feature already available?
Thanks!
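
In the meantime, a possible workaround (assuming the default sequence-to-sequence vocabulary keys; paths are illustrative) is to point both sides at the same vocabulary file in the data configuration:

data:
  source_words_vocabulary: data/shared.vocab
  target_words_vocabulary: data/shared.vocab

Note that this only shares the token set, not the embedding weights themselves.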

CPU memory leak when using train_and_eval run type

I'm currently running some experiments using the sequence_to_sequence model.
I noticed that, when using the train_and_eval run type, the RAM used by the process (CPU RAM, not GPU) increases over time. More specifically, it increases after each evaluation: the memory does not seem to be released after the evaluation period, and each call to the evaluation hooks increases the usage.

I didn't notice this behavior when using only the train run type.

Has anyone noticed the same problem?
