google / seq2seq

A general-purpose encoder-decoder framework for Tensorflow

Home Page: https://google.github.io/seq2seq/

License: Apache License 2.0

Python 94.55% Shell 3.05% CSS 0.20% JavaScript 1.01% Perl 1.20%

tensorflow translation machine-translation neural-network deeplearning

seq2seq's Issues

How do I set the training data?

How do I set the downloaded training data for this file?

Write Image Captioning Walkthrough

Blocked by #18

The documentation should have an end-to-end walkthrough of training and evaluating an Image Captioning model using standard datasets.

Refactor bridge_spec

Instead of having a bridge_spec parameter we should have bridge.class and bridge.params to keep it consistent with the rest of the parameters.

Error when running the command: python -m unittest seq2seq.test.pipeline_test

~/desktop/code/python_program/seq2seq$ python -m unittest seq2seq.test.pipeline_test
.E

ERROR: test_train_infer (seq2seq.test.pipeline_test.PipelineTest)
Tests training and inference scripts.

Traceback (most recent call last):
File "seq2seq/test/pipeline_test.py", line 78, in test_train_infer
os.path.join(BIN_FOLDER, "train.py"))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-12: ordinal not in range(128)

Ran 2 tests in 0.003s

FAILED (errors=1)

Check docstrings

Due to a lot of refactoring many Python docstrings are currently outdated. Need to go through the code, make sure they are still correct, and update where necessary.

failed to allocate 11.90G CUDA_ERROR_OUT_OF_MEMORY

When I tried the WMT'16 EN-DE sample, I encountered the following CUDA_ERROR_OUT_OF_MEMORY:

name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:01:00.0
Total memory: 11.90GiB
Free memory: 11.39GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 11.90G (12778405888 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Traceback (most recent call last):
File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/media/sbai/7A9C9BED9C9BA1E5/DL/seq2seq/bin/train.py", line 251, in <module>
tf.app.run()
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/media/sbai/7A9C9BED9C9BA1E5/DL/seq2seq/bin/train.py", line 246, in main
schedule=FLAGS.schedule)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 106, in run
return task()
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 459, in train_and_evaluate
self.train(delay_secs=0)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 281, in train
monitors=self._train_monitors + extra_hooks)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 426, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 984, in _train_model
_, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 462, in run
run_metadata=run_metadata)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 786, in run
run_metadata=run_metadata)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 744, in run
return self._sess.run(*args, **kwargs)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 883, in run
feed_dict, options)
File "/home/sbai/tf134/lib/python3.4/site-packages/tensorflow/python/training/monitored_session.py", line 909, in _call_hook_before_run
request = hook.before_run(run_context)
File "/media/sbai/7A9C9BED9C9BA1E5/DL/seq2seq/seq2seq/training/hooks.py", line 239, in before_run
"predicted_tokens": self._pred_dict["predicted_tokens"],
KeyError: 'predicted_tokens'

Env: TF1.0 GPU & Python3.4 & ubuntu14.04
I reduced the batch size and num_units, but still encountered the same error.
I also tried the toy data and hit the same error.
Is it because I am using Python 3.4?

Update:
I tried it on Python 3.5 and got the same error on the first try. When I tried again, I got the following error:

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: BaseMonitor.__init__ (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
*** Error in `python3.5': double free or corruption (!prev): 0x0000000002870d90 ***
Aborted (core dumped)

Configurable metrics (support google/sentencepiece)

Currently, the decoding and metrics are hardcoded to:

Join tokens on space
Automatically deal with the BPE segmentation token @@

This should be configurable and should be able to support processing done by google/sentencepiece. What needs to happen is to allow users to pass parameters to metrics.
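A minimal sketch of what a configurable post-processing step could look like, assuming the metric receives a list of tokens per sentence. The function name `decode_tokens` and its `mode` parameter are hypothetical and not part of the seq2seq code base. The SentencePiece branch relies on the U+2581 word-boundary marker that SentencePiece prepends to word-initial pieces.

```python
# Hypothetical helper: turns a token list back into plain text before a
# metric such as BLEU is computed. None of these names exist in seq2seq.
def decode_tokens(tokens, mode="bpe", delimiter=" ", bpe_marker="@@"):
    if mode == "sentencepiece":
        # SentencePiece pieces are concatenated directly; the U+2581 marker
        # indicates where a new word starts.
        return "".join(tokens).replace("\u2581", " ").strip()
    text = delimiter.join(tokens)
    if mode == "bpe":
        # A BPE piece ending in the marker is glued to the following piece.
        text = text.replace(bpe_marker + delimiter, "")
        if text.endswith(bpe_marker):
            text = text[:-len(bpe_marker)]
    return text


print(decode_tokens(["new", "er@@", "est", "mod@@", "el"]))
print(decode_tokens(["\u2581Hello", "\u2581wor", "ld"], mode="sentencepiece"))
```

Passing `mode="plain"` only joins the tokens on the delimiter, which would cover data that needs no subword handling.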

Model Ensembles

What is the best way to implement model ensembling per time step in Tensorflow? Models are ensembled by averaging the output probabilities at each decoding step. Is there a way to do this using raw_rnn?
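To make the question concrete, here is a framework-independent sketch of per-step ensembling in plain Python. It does not answer the raw_rnn part; it only illustrates averaging the output probabilities at each decoding step. The "models" are hypothetical callables that map the decoded prefix to a list of logits.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_step(per_model_logits):
    """Averages the output distributions of all models for one step."""
    probs = [softmax(logits) for logits in per_model_logits]
    vocab_size = len(probs[0])
    return [sum(p[i] for p in probs) / len(probs) for i in range(vocab_size)]


def greedy_decode(models, start_token, end_token, max_steps=10):
    """Greedy decoding where every step uses the averaged distribution."""
    prefix = [start_token]
    for _ in range(max_steps):
        averaged = ensemble_step([model(prefix) for model in models])
        next_token = max(range(len(averaged)), key=averaged.__getitem__)
        prefix.append(next_token)
        if next_token == end_token:
            break
    return prefix


# Two toy "models" over a 3-token vocabulary (assumptions for illustration).
model_a = lambda prefix: [0.0, 2.0, 0.0]
model_b = lambda prefix: [0.0, 0.0, 1.0]
print(greedy_decode([model_a, model_b], start_token=0, end_token=1))
```

In a real graph, all models would have to advance in lockstep inside the same decoding loop, since each step's input depends on the token chosen from the averaged distribution.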

Why do I always get the error "ValueError: Input Pipeline definition must have a class property."?

Here is my config.yml:

model: BasicSeq2Seq
model_params:
  bridge.class: seq2seq.models.bridges.InitialStateBridge
  embedding.dim: 1024
  encoder.class: seq2seq.encoders.UnidirectionalRNNEncoder
  encoder.params:
    rnn_cell:
      cell_class: BasicLSTMCell
      cell_params:
        num_units: 512
      dropout_input_keep_prob: 0.8
      dropout_output_keep_prob: 1.0
      num_layers: 1
  decoder.class: seq2seq.decoders.BasicDecoder
  decoder.params:
    rnn_cell:
      cell_class: BasicLSTMCell
      cell_params:
        num_units: 512
      dropout_input_keep_prob: 0.8
      dropout_output_keep_prob: 1.0
      num_layers: 1
  optimizer.name: Adam
  optimizer.learning_rate: 0.0001
  source.max_seq_len: 83
  source.reverse: false
  target.max_seq_len: 95
  vocab_source: ./data/vocab_post_50000
  vocab_target: ./data/vocab_comt_50000
input_pipeline_train:
  class: ParallelTextInputPipeline
  params:
    source_files: ['./data/post_50000']
    target_files: ['./data/comt_50000']
batch_size: 64
train_steps: 1000
output_dir: ./model/50000

When I run the model with the command below, I always get the error "ValueError: Input Pipeline definition must have a class property":
python3 -m bin.train --config_paths="./myconfig/config_50000.yml,./myconfig/train_seq2seq.yml"

Refactor model base class

The _build method of the model base class is currently very long. If a subclass wants to overwrite this method it needs to copy the full code and change the relevant parts. It should be possible to refactor the method into smaller ones (embedding, encoding, decoding, loss, etc) so that parts are easier to swap out.
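A rough sketch of the kind of split described here, using the template-method pattern. The class and method names are hypothetical and the bodies are toy arithmetic, not the real seq2seq graph-building code; the point is only that a subclass can swap one stage without copying the rest.

```python
class ModelSketch(object):
    """Hypothetical base class: build() only wires the stages together."""

    def build(self, features, labels):
        embedded = self.embed(features)
        encoded = self.encode(embedded)
        decoded = self.decode(encoded)
        return self.compute_loss(decoded, labels)

    # Each stage is a small method that a subclass can override on its own.
    def embed(self, features):
        return [float(x) for x in features]

    def encode(self, embedded):
        return sum(embedded)

    def decode(self, encoded):
        return encoded * 2.0

    def compute_loss(self, decoded, labels):
        return abs(decoded - labels)


class MeanEncoderModel(ModelSketch):
    """Overrides only the encoding stage; everything else is inherited."""

    def encode(self, embedded):
        return sum(embedded) / len(embedded)


print(ModelSketch().build([1, 2, 3], 12.0))
print(MeanEncoderModel().build([1, 2, 3], 12.0))
```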

InvalidArgumentError (see above for traceback): Tried to read from index 32 but array size is: 32

Parsing GraphDef...
Parsing RunMetadata...
Parsing OpLog...
Preparing Views...
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/disk1/mouna/code/seq2seq/bin/train.py", line 251, in <module>
tf.app.run()
File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/disk1/mouna/code/seq2seq/bin/train.py", line 246, in main
schedule=FLAGS.schedule)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 106, in run
return task()
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 459, in train_and_evaluate
self.train(delay_secs=0)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 281, in train
monitors=self._train_monitors + extra_hooks)
File "/usr/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 426, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 984, in _train_model
_, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 462, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 786, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 744, in run
return self._sess.run(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 899, in run
run_metadata=run_metadata))
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 1157, in after_run
induce_stop = m.step_end(self._last_step, result)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 356, in step_end
return self.every_n_step_end(step, output)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 657, in every_n_step_end
steps=self.eval_steps, metrics=self.metrics, name=self.name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 514, in evaluate
log_progress=log_progress)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 836, in _evaluate_model
hooks=hooks)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/training/python/training/evaluation.py", line 430, in evaluate_once
session.run(eval_ops, feed_dict)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 462, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 786, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 744, in run
return self._sess.run(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 891, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 744, in run
return self._sess.run(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
feed_dict_string, options, run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
target_list, options, run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to read from index 32 but array size is: 32
[[Node: model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3 = TensorArrayReadV3[_class=["loc:@model/att_seq2seq/decode/TrainingHelper/TensorArray"], dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3/Switch, model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3/Switch_1/_463, model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3/Switch_2)]]

Caused by op u'model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3', defined at:
File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/disk1/mouna/code/seq2seq/bin/train.py", line 251, in <module>
tf.app.run()
File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/disk1/mouna/code/seq2seq/bin/train.py", line 246, in main
schedule=FLAGS.schedule)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 106, in run
return task()
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 459, in train_and_evaluate
self.train(delay_secs=0)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 281, in train
monitors=self._train_monitors + extra_hooks)
File "/usr/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 426, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 984, in _train_model
_, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 462, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 786, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 744, in run
return self._sess.run(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 899, in run
run_metadata=run_metadata))
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 1157, in after_run
induce_stop = m.step_end(self._last_step, result)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 356, in step_end
return self.every_n_step_end(step, output)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 657, in every_n_step_end
steps=self.eval_steps, metrics=self.metrics, name=self.name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 514, in evaluate
log_progress=log_progress)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 810, in _evaluate_model
eval_ops = self._get_eval_ops(features, labels, metrics)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1190, in _get_eval_ops
features, labels, model_fn_lib.ModeKeys.EVAL)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1133, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "/disk1/mouna/code/seq2seq/bin/train.py", line 164, in model_fn
return model(features, labels, params)
File "seq2seq/models/model_base.py", line 111, in __call__
return self._build(features, labels, params)
File "seq2seq/models/seq2seq_model.py", line 263, in _build
decoder_output, _, = self.decode(encoder_output, features, labels)
File "seq2seq/graph_utils.py", line 38, in func_wrapper
return templated_func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/template.py", line 276, in __call__
return self._call_func(args, kwargs, check_for_new_variables=False)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/template.py", line 216, in _call_func
result = self._func(*args, **kwargs)
File "seq2seq/models/basic_seq2seq.py", line 124, in decode
labels)
File "seq2seq/models/basic_seq2seq.py", line 87, in _decode_train
return decoder(decoder_initial_state, helper_train)
File "seq2seq/graph_module.py", line 57, in __call__
return self._template(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/template.py", line 267, in __call__
return self._call_func(args, kwargs, check_for_new_variables=False)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/template.py", line 216, in _call_func
result = self._func(*args, **kwargs)
File "seq2seq/decoders/rnn_decoder.py", line 110, in _build
maximum_iterations=maximum_iterations)
File "seq2seq/contrib/seq2seq/decoder.py", line 282, in dynamic_decode
swap_memory=swap_memory)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2605, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2438, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2388, in BuildLoop
body_result = body(*packed_vars_for_body)
File "seq2seq/contrib/seq2seq/decoder.py", line 242, in body
decoder_finished) = decoder.step(time, inputs, state)
File "seq2seq/decoders/attention_decoder.py", line 186, in step
time=time, outputs=outputs, state=cell_state, sample_ids=sample_ids)
File "seq2seq/contrib/seq2seq/helper.py", line 125, in next_inputs
time=time, outputs=outputs, state=state, sample_ids=sample_ids)
File "seq2seq/decoders/attention_decoder.py", line 154, in att_next_inputs
name=name)
File "seq2seq/contrib/seq2seq/helper.py", line 204, in next_inputs
lambda: nest.map_structure(read_from_ta, self._input_tas))
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1745, in cond
_, res_f = context_f.BuildCondBranch(fn2)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1639, in BuildCondBranch
r = fn()
File "seq2seq/contrib/seq2seq/helper.py", line 204, in <lambda>
lambda: nest.map_structure(read_from_ta, self._input_tas))
File "/usr/lib/python2.7/site-packages/tensorflow/python/util/nest.py", line 302, in map_structure
structure[0], [func(*x) for x in entries])
File "seq2seq/contrib/seq2seq/helper.py", line 200, in read_from_ta
return inp.read(next_time)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 250, in read
name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2421, in _tensor_array_read_v3
name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Tried to read from index 32 but array size is: 32
[[Node: model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3 = TensorArrayReadV3[_class=["loc:@model/att_seq2seq/decode/TrainingHelper/TensorArray"], dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3/Switch, model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3/Switch_1/_463, model/att_seq2seq/decode/attention_decoder_1/decoder/while/CustomHelperNextInputs/TrainingHelperNextInputs/cond/TensorArrayReadV3/Switch_2)]]

Should be able to pass configs with newlines

When passing comma-delimited configs we should strip out newlines.
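A small sketch of the intended behaviour, assuming the value in question is a comma-delimited list of config file paths such as the one passed to --config_paths. The helper name `split_config_paths` is made up for illustration.

```python
def split_config_paths(config_paths):
    """Splits a comma-delimited string and drops newlines and other
    whitespace around each entry, as well as empty entries."""
    return [path.strip() for path in config_paths.split(",") if path.strip()]


# A multi-line value, as it might be written in a shell script.
raw_value = """
    ./myconfig/config_50000.yml,
    ./myconfig/train_seq2seq.yml
"""
print(split_config_paths(raw_value))
```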

Prepare WMT'17 Datasets

We should prepare datasets for all WMT'17 language pairs. This is also a chance to try out google/sentencepiece as a preprocessor.

Each dataset should come in different configurations, i.e. different vocabulary sizes and also have a character-level version.

Together with the raw data files we also need the script that was used for the process.

During evaluation metric_fn is called epoch_size/batch_size times with the same data

During debugging of bug #39 I found that metric_fn was called many, many times with the same data, so I probed further.

I dumped every call to metric_fn to a file. The input grows by batch_size with every call, until metric_fn is called with the entire (dev) dataset. The first 32 rows in metric-dump-02-hyp are equal to the rows in metric-dump-01-hyp, and so forth. This seems redundant.

I'm worried about how this affects the reported metrics. Is it the last call? The first? The average?

      1 metric-dump-00-hyp.txt
      1 metric-dump-00-ref.txt
     32 metric-dump-01-hyp.txt
     32 metric-dump-01-ref.txt
     64 metric-dump-02-hyp.txt
     64 metric-dump-02-ref.txt
     96 metric-dump-03-hyp.txt
     96 metric-dump-03-ref.txt
    128 metric-dump-04-hyp.txt
    128 metric-dump-04-ref.txt
    160 metric-dump-05-hyp.txt
    160 metric-dump-05-ref.txt
    192 metric-dump-06-hyp.txt
    192 metric-dump-06-ref.txt
    224 metric-dump-07-hyp.txt
    224 metric-dump-07-ref.txt
    256 metric-dump-08-hyp.txt
    256 metric-dump-08-ref.txt
    288 metric-dump-09-hyp.txt
    288 metric-dump-09-ref.txt
    320 metric-dump-10-hyp.txt
    320 metric-dump-10-ref.txt
    352 metric-dump-11-hyp.txt
    352 metric-dump-11-ref.txt
    384 metric-dump-12-hyp.txt
    384 metric-dump-12-ref.txt
    416 metric-dump-13-hyp.txt
    416 metric-dump-13-ref.txt
    448 metric-dump-14-hyp.txt
    448 metric-dump-14-ref.txt
    480 metric-dump-15-hyp.txt
    480 metric-dump-15-ref.txt
    512 metric-dump-16-hyp.txt
    512 metric-dump-16-ref.txt
    544 metric-dump-17-hyp.txt
    544 metric-dump-17-ref.txt
    576 metric-dump-18-hyp.txt
    576 metric-dump-18-ref.txt
    608 metric-dump-19-hyp.txt
    608 metric-dump-19-ref.txt
    640 metric-dump-20-hyp.txt
    640 metric-dump-20-ref.txt
    672 metric-dump-21-hyp.txt
    672 metric-dump-21-ref.txt
    704 metric-dump-22-hyp.txt
    704 metric-dump-22-ref.txt
    736 metric-dump-23-hyp.txt
    736 metric-dump-23-ref.txt
    768 metric-dump-24-hyp.txt
    768 metric-dump-24-ref.txt
    800 metric-dump-25-hyp.txt
    800 metric-dump-25-ref.txt
    832 metric-dump-26-hyp.txt
    832 metric-dump-26-ref.txt
    864 metric-dump-27-hyp.txt
    864 metric-dump-27-ref.txt
    893 metric-dump-28-hyp.txt
    893 metric-dump-28-ref.txt
    893 metric-dump-29-hyp.txt
    893 metric-dump-29-ref.txt
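The pattern in the listing can be reproduced with a small stand-alone sketch in which every evaluation step appends its batch to a running list and recomputes the metric over everything seen so far. This is only an assumption about what produces the dumps above, not a confirmed description of the seq2seq implementation. With 893 examples and a batch size of 32 it yields 28 calls of sizes 32, 64, ..., 864, 893.

```python
def accuracy(hypotheses, references):
    matches = sum(h == r for h, r in zip(hypotheses, references))
    return matches / float(len(references))


def cumulative_metric_calls(examples, batch_size, metric_fn):
    """Calls metric_fn once per batch on all examples accumulated so far.

    Returns a list of (number_of_rows_passed, metric_value) tuples.
    """
    seen_hyp, seen_ref = [], []
    calls = []
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        seen_hyp.extend(hyp for hyp, _ in batch)
        seen_ref.extend(ref for _, ref in batch)
        calls.append((len(seen_hyp), metric_fn(seen_hyp, seen_ref)))
    return calls


# Toy dev set: every fourth hypothesis is wrong.
examples = [("x" if i % 4 == 0 else "y", "y") for i in range(893)]
calls = cumulative_metric_calls(examples, 32, accuracy)
print(len(calls), calls[0][0], calls[-1][0])
```

Under this assumption, the value of the last call equals the metric over the full dev set, while the earlier values cover only a prefix of it. Which of these values is actually reported is the open question in this issue.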

Implement Pooling/Convolutional Encoder

Implement the convolutional encoder described in https://arxiv.org/abs/1611.02344.
Implement the baseline pooling encoder with position embeddings from the same paper.

Serving using ExportStrategy

We should figure out how to export models for serving. I think Tensorflow provides something like an ExportStrategy that can be passed to the estimator, which will then occasionally export the model.

Generate pre-processed EN-DE translation dataset

We currently have a script that generates translation data, but running it can take an hour or two, mostly due to the BPE processing. We should create a dataset that users can simply download.

The dataset probably only needs to include the 32k vocabulary BPE, not all of them.

When metric_fn is called, references start with SEQUENCE_START and hypotheses do not.

This breaks a simple accuracy metric that checks hyp==ref.

I don't know if this is intended, but at least to me it was a surprise.

Thanks for the great lib btw.
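As a possible workaround on the metric side, a sketch that drops a leading start token from each reference before comparing. It assumes hypotheses and references arrive as token lists and that the start token is the literal string SEQUENCE_START; both are assumptions for illustration, and the function names are hypothetical.

```python
SEQUENCE_START = "SEQUENCE_START"  # assumed literal value of the start token


def strip_start_token(tokens, start_token=SEQUENCE_START):
    """Returns the tokens without a leading start token, if there is one."""
    if tokens and tokens[0] == start_token:
        return tokens[1:]
    return tokens


def sequence_accuracy(hypotheses, references):
    """Fraction of hypotheses that match their reference exactly."""
    matches = sum(
        hyp == strip_start_token(ref)
        for hyp, ref in zip(hypotheses, references))
    return matches / float(len(references))


hyps = [["a", "b"], ["c"]]
refs = [["SEQUENCE_START", "a", "b"], ["SEQUENCE_START", "d"]]
print(sequence_accuracy(hyps, refs))
```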

Internal: Failed to run py callback pyfunc_7

Hi,
I am running the code with the default configuration (nmt_small.yaml, with the hidden layer size changed from 128 to 50) on a TITAN X. The first 1000 training steps run fine, but then the evaluation fails with the following errors:

tensorflow/core/framework/op_kernel.cc:993] Internal: Failed to run py callback pyfunc_2: see error log.
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
ret = func(*args)
File "/home/ultralisksu/host/seq2seq/seq2seq/metrics/metric_specs.py", line 156, in _py_func
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
ret = func(*args)
File "/home/ultralisksu/host/seq2seq/seq2seq/metrics/metric_specs.py", line 156, in _py_func
ret = func(*args)
File "/home/ultralisksu/host/seq2seq/seq2seq/metrics/metric_specs.py", line 156, in _py_func
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
return self.metric_fn(sliced_hypotheses, sliced_references)
File "/home/ultralisksu/host/seq2seq/seq2seq/metrics/metric_specs.py", line 206, in metric_fn
return self.metric_fn(sliced_hypotheses, sliced_references)

I attached the full logs.
log.log.txt

Thanks

seg fault when running nose tests

Installed according to the guide on the contribution page.

I'm running:

Ubuntu 14.04
tensorflow 1.0.0
Python 3.5.1 (default, Mar 14 2017, 15:32:51)

ImportError: cannot import name contrib

Hi all,
Following the page at https://google.github.io/seq2seq/getting_started/, I got this error:

user@localhost:~/Desktop/seq2seq$ pip install -e .
Obtaining file:///home/user/Desktop/seq2seq
Requirement already satisfied: numpy in /home/user/anaconda2/lib/python2.7/site-packages (from seq2seq==0.1)
Requirement already satisfied: matplotlib in /home/user/anaconda2/lib/python2.7/site-packages (from seq2seq==0.1)
Requirement already satisfied: pyyaml in /home/user/anaconda2/lib/python2.7/site-packages (from seq2seq==0.1)
Requirement already satisfied: pyrouge in /home/user/anaconda2/lib/python2.7/site-packages (from seq2seq==0.1)
Requirement already satisfied: six>=1.10 in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Requirement already satisfied: python-dateutil in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Requirement already satisfied: functools32 in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Requirement already satisfied: subprocess32 in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Requirement already satisfied: pytz in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Requirement already satisfied: cycler>=0.10 in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=1.5.6 in /home/user/anaconda2/lib/python2.7/site-packages (from matplotlib->seq2seq==0.1)
Installing collected packages: seq2seq
  Running setup.py develop for seq2seq
Successfully installed seq2seq
user@localhost:~/Desktop/seq2seq$ python -m unittest seq2seq.test.pipeline_test
Traceback (most recent call last):
  File "/home/user/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/user/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/user/anaconda2/lib/python2.7/unittest/__main__.py", line 12, in <module>
    main(module=None)
  File "/home/user/anaconda2/lib/python2.7/unittest/main.py", line 94, in __init__
    self.parseArgs(argv)
  File "/home/user/anaconda2/lib/python2.7/unittest/main.py", line 149, in parseArgs
    self.createTests()
  File "/home/user/anaconda2/lib/python2.7/unittest/main.py", line 158, in createTests
    self.module)
  File "/home/user/anaconda2/lib/python2.7/unittest/loader.py", line 130, in loadTestsFromNames
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/home/user/anaconda2/lib/python2.7/unittest/loader.py", line 91, in loadTestsFromName
    module = __import__('.'.join(parts_copy))
  File "seq2seq/__init__.py", line 24, in <module>
    from seq2seq import contrib
ImportError: cannot import name contrib

PS: It should be
pip install -e .
at https://google.github.io/seq2seq/getting_started/

Support Image Captioning (and other tasks)

In theory it should be easy to support Image Captioning by just swapping out the encoder with something like ResNet/Inception (e.g. tensorflow.contrib.slim.python.slim.nets.inception_v3). However, there are a few things that need to happen to support problems other than text-to-text.

Currently, the parameters to the train/inference scripts are specific to text Sequence-To-Sequence, e.g. source_vocabulary, source_delimiter, etc. We probably need another abstraction layer that defines what kind of task the user is solving and adjust flags/parameters based on it. For example, I could imagine having a Task class, with TextToText, ImageToText, ..., subclasses. The user then passes the type of task as part of the config and the task class is responsible for setting the appropriate parameters and creating the model.
Support for pre-trained networks. For example, when training image captioning models one typically initializes the encoder network with pre-trained image classification network weights. This can probably be done through some kind of SessionRunHook that loads a subset of the variables. In other words, the hooks used in the training script must be configurable.
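One way the task abstraction from the first point could look, as a hedged sketch. The class names Task, TextToText and ImageToText come from the description above; everything else (the `required_params` attribute, the `create_task` factory, the `image_size` parameter and the config layout) is hypothetical.

```python
class Task(object):
    """Hypothetical base class: validates the parameters a task needs."""
    required_params = ()

    def __init__(self, params):
        missing = [key for key in self.required_params if key not in params]
        if missing:
            raise ValueError("Missing parameters for %s: %s" % (
                type(self).__name__, ", ".join(missing)))
        self.params = params


class TextToText(Task):
    # Text-specific parameters mentioned in the issue.
    required_params = ("source_vocabulary", "source_delimiter")


class ImageToText(Task):
    # Hypothetical parameter; an image task has no source vocabulary.
    required_params = ("image_size",)


TASKS = {"TextToText": TextToText, "ImageToText": ImageToText}


def create_task(config):
    """Builds the task named in the config with its own parameters."""
    return TASKS[config["task"]](config.get("params", {}))


task = create_task({
    "task": "TextToText",
    "params": {"source_vocabulary": "vocab.txt", "source_delimiter": " "},
})
print(type(task).__name__)
```

Each subclass declares its own parameters, so the train and inference scripts would no longer need text-specific flags hard-coded.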

Bring Test Coverage back to above 98%

Due to a lot of recent refactoring the test coverage has fallen significantly. Take a look at the CircleCI coverage report to bring it back above 98%.

The coverage reports can be found by clicking on the latest build -> Artifacts -> Coverage -> index.html

run pipeline_test.py

I have installed seq2seq successfully on Windows 10 with tensorflow-gpu (1.0).
When I run seq2seq.test.pipeline_test.py, it reports the following errors:

Traceback (most recent call last):
File "F:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\ops\script_ops.py", line 85, in __call__
ret = func(*args)
File "H:\java_pro\tensorflow\project_src\seq2seq-master\seq2seq-master\seq2seq\metrics\metric_specs.py", line 132, in _py_func
return self.metric_fn(sliced_hypotheses, sliced_references)
File "H:\java_pro\tensorflow\project_src\seq2seq-master\seq2seq-master\seq2seq\metrics\metric_specs.py", line 157, in metric_fn
return bleu.moses_multi_bleu(hypotheses, references, lowercase=False)
File "H:\java_pro\tensorflow\project_src\seq2seq-master\seq2seq-master\seq2seq\metrics\bleu.py", line 71, in moses_multi_bleu
with open(hypothesis_file.name, "r") as read_pred:
PermissionError: [Errno 13] Permission denied: 'C:\Users\gdy\AppData\Local\Temp\tmpjo9ekggj'
W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\framework\op_kernel.cc:993] Internal: Failed to run py callback pyfunc_0: see error log.
EF:\Program Files\Anaconda3\lib\unittest\case.py:628: ResourceWarning: unclosed file <_io.BufferedRandom name=3>
outcome.errors.clear()
F:\Program Files\Anaconda3\lib\unittest\case.py:628: ResourceWarning: unclosed file <_io.BufferedRandom name=4>
outcome.errors.clear()
F:\Program Files\Anaconda3\lib\unittest\case.py:628: ResourceWarning: unclosed file <_io.BufferedRandom name=5>
outcome.errors.clear()
F:\Program Files\Anaconda3\lib\unittest\case.py:628: ResourceWarning: unclosed file <_io.BufferedRandom name=6>
outcome.errors.clear()
F:\Program Files\Anaconda3\lib\unittest\case.py:628: ResourceWarning: unclosed file <_io.BufferedRandom name=7>
outcome.errors.clear()
F:\Program Files\Anaconda3\lib\unittest\case.py:628: ResourceWarning: unclosed file <_io.BufferedRandom name=8>
outcome.errors.clear()
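
A possible explanation for the PermissionError above: on Windows, a file created with tempfile.NamedTemporaryFile cannot be opened a second time by name while the first handle is still open. The sketch below assumes that bleu.py uses such a temporary file (suggested by the hypothesis_file name and the Temp path, but not confirmed here). The function name write_then_reopen is hypothetical.

```python
import os
import tempfile


def write_then_reopen(text):
    """Writes text to a temp file, closes it, then reads it back by name."""
    # delete=False keeps the file on disk after close(), so it can be reopened.
    tmp = tempfile.NamedTemporaryFile(mode="w", delete=False, encoding="utf-8")
    try:
        tmp.write(text)
        tmp.close()  # close first so Windows allows a second open
        with open(tmp.name, "r", encoding="utf-8") as read_pred:
            return read_pred.read()
    finally:
        tmp.close()  # harmless if already closed
        os.unlink(tmp.name)  # manual cleanup, since delete=False
```

The trade-off is that the caller becomes responsible for removing the file, which is why the cleanup sits in a finally block.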

Train an Image Captioning model on MS COCO Data

Blocked by #12

Generate dataset using https://github.com/tensorflow/models/tree/master/im2txt (there may already be a version floating around)
Train a model and evaluate ROUGE/BLEU

Error from the simple pipeline unit test

When I was executing

python -m unittest seq2seq.test.pipeline_test

there was an error and the following is the log:

I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 5705 get requests, put_count=5553 evicted_count=1000 eviction_rate=0.180083 and unsatisfied allocation rate=0.219457
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 100 to 110
INFO:tensorflow:Performing full trace on next step.
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcupti.so.8.0. LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/lib64:
F tensorflow/core/platform/default/gpu/cupti_wrapper.cc:59] Check failed: ::tensorflow::Status::OK() == (::tensorflow::Env::Default()->GetSymbolFromLibrary( GetDsoHandle(), kName, &f)) (OK vs. Not found: /home/dl/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cuptiActivityRegisterCallbacks)could not find cuptiActivityRegisterCallbacksin libcupti DSO

There's a fatal error in the last line and I'm not sure how to fix it. This bug appeared only after I git pulled the repository today; a version from last week worked fine. Can someone please help?

Ubuntu 14.04, Cuda 8.0

Preprocessing using tf.Transform

tf.Transform is a new library for TensorFlow that allows users to define preprocessing pipelines. Not sure how mature and easy to integrate it is, but it's worth looking into.

Replicate GNMT architecture

To replicate the GNMT architecture, the following needs to happen. This list is not exhaustive and other things may be required:

Implement a decoder that applies attention to the first level of the cell. This should be straightforward and only require a few lines of code change.
Add an encoder that is bidirectional only in the first layer. This also should be straightforward and require < 100 lines of code to add a new encoder.
Add "Residual connections start from the layer third from the bottom in the encoder and decoder." This may require a new cell type.
Optimizer switching: Add parameters to switch optimizers during training, e.g. start with Adam and switch to SGD with learning rate decay later on. Probably ~50 lines of code to add new hyperparameters for optimizer switching.
Use google/sentencepiece to pre-process data
Add pruning heuristics to beam search. This may or may not need significant changes.
There are a few gradient clipping tricks that are not mentioned in the paper; we need to figure out the details for those.

This issue is only here to keep track of the high-level tasks. All of the points above should probably be done in separate issues.
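
As an illustration of the optimizer switching item above, the schedule could be driven by a small function like the one below. This is a sketch only: the function name, the parameter names, and the example values are hypothetical, and the staircase decay is one possible choice rather than what the paper or seq2seq prescribes.

```python
def optimizer_schedule(step, switch_step, adam_lr, sgd_lr, decay_rate, decay_steps):
    """Returns (optimizer_name, learning_rate) for a given global step.

    Uses Adam with a constant learning rate before switch_step, then SGD
    with a staircase exponential decay counted from switch_step.
    """
    if step < switch_step:
        return "Adam", adam_lr
    num_decays = (step - switch_step) // decay_steps
    return "SGD", sgd_lr * decay_rate ** num_decays
```

For example, optimizer_schedule(150, 100, 0.001, 0.5, 0.5, 50) selects SGD with the learning rate halved once.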

Adding GLEU score

nltk has a simple implementation of the GLEU score as described in Wu et al.'s (2016) GNMT system: https://github.com/nltk/nltk/blob/develop/nltk/translate/gleu_score.py

It would be great if seq2seq had a similar port of GLEU =)
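
For reference, sentence-level GLEU as defined by Wu et al. is the minimum of n-gram precision and n-gram recall over all 1- to 4-grams. A minimal single-reference sketch in plain Python follows. The function names are hypothetical, and this is neither the nltk code nor a seq2seq API.

```python
from collections import Counter


def ngram_counts(tokens, min_len=1, max_len=4):
    """Counts every n-gram of length min_len..max_len in a token list."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def gleu_sentence(reference, hypothesis, min_len=1, max_len=4):
    """Returns min(precision, recall) over the n-grams of one sentence pair."""
    ref = ngram_counts(reference, min_len, max_len)
    hyp = ngram_counts(hypothesis, min_len, max_len)
    total_ref = sum(ref.values())
    total_hyp = sum(hyp.values())
    if total_ref == 0 or total_hyp == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    matches = sum((ref & hyp).values())
    return min(matches / total_hyp, matches / total_ref)
```

Because the score is symmetric in precision and recall, a hypothesis that is too short is penalized just like one that is too long.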

Custom Training Loops for RL/GAN

Add support for custom training loops to easily train GANs and RL algorithms. Two ideas on how to implement this:

Make it part of the model and use SessionRunHooks to execute different train ops
Subclass Estimator with a custom MonitoredSession

Need to look into these options in more detail.

Error in 'python': double free or corruption (!prev)

I was playing with the toy data and this error appeared when I ran the 'Decoding with Beam Search' script. What's the probable reason and how can I fix it? Thank you.

Write Summarization Walkthrough

The documentation should have an end-to-end walkthrough of training and evaluating a Summarization model, including installing and evaluating ROUGE. This will be very similar to the Machine Translation walkthrough #16. Unfortunately we can't publish the data for this.

Move `create_predictions` into model class

Move the create_predictions function into the model class. This makes it easier for subclasses to override this method.

Error in test_train_infer

So, I have cloned the repo and am trying to start using it. As a sensible first step I decided to run the provided test, only to be greeted with this:

http://pastebin.com/9XBrTb7m

I tried reinstalling all I could, and I am pretty sure all other required pieces are as up-to-date as they could ever be and are functioning properly. I am using OS X 10.12 and/or 10.11 --- the error is present in both.

What could be the source of this and will it hinder my ability to use the software?

NameError: global name 'rouge_r_f' is not defined

Correcting the variable names in rouge.py works fine

Write Machine Translation Walkthrough

The documentation should have an end-to-end walkthrough of training and evaluating a Machine Translation model using standard datasets and BLEU scripts.

wmt16_en_de.sh throws No such file or directory: '--max_vocab_size'

Input sentences: 2999 Output sentences: 2997
Cleaning /Users/nchan/programs/git-ws/seq2seq/bin/data/output-all//train...
clean-corpus.perl: processing /Users/nchan/programs/git-ws/seq2seq/bin/data/output-all//train.de & .en to /Users/nchan/programs/git-ws/seq2seq/bin/data/output-all//train.clean, cutoff 1-80, ratio 9
..........(100000)..........(200000)..........(300000)..........(400000)..........(500000)..........(600000)..........(700000)..........(800000)..........(900000)..........(1000000)..........(1100000)..........(1200000)..........(1300000)..........(1400000)..........(1500000)..........(1600000)..........(1700000)..........(1800000)..........(1900000)..........(2000000)..........(2100000)..........(2200000)..........(2300000)..........(2400000)..........(2500000)..........(2600000)..........(2700000)..........(2800000)..........(2900000)..........(3000000)..........(3100000)..........(3200000)..........(3300000)..........(3400000)..........(3500000)..........(3600000)..........(3700000)..........(3800000)..........(3900000)..........(4000000)..........(4100000)..........(4200000)..........(4300000)..........(4400000)..........(4500000)......
Input sentences: 4562102 Output sentences: 4524868
Cleaning /Users/nchan/programs/git-ws/seq2seq/bin/data/output-all//train.tok...
clean-corpus.perl: processing /Users/nchan/programs/git-ws/seq2seq/bin/data/output-all//train.tok.de & .en to /Users/nchan/programs/git-ws/seq2seq/bin/data/output-all//train.tok.clean, cutoff 1-80, ratio 9
..........(100000)..........(200000)..........(300000)..........(400000)..........(500000)..........(600000)..........(700000)..........(800000)..........(900000)..........(1000000)..........(1100000)..........(1200000)..........(1300000)..........(1400000)..........(1500000)..........(1600000)..........(1700000)..........(1800000)..........(1900000)..........(2000000)..........(2100000)..........(2200000)..........(2300000)..........(2400000)..........(2500000)..........(2600000)..........(2700000)..........(2800000)..........(2900000)..........(3000000)..........(3100000)..........(3200000)..........(3300000)..........(3400000)..........(3500000)..........(3600000)..........(3700000)..........(3800000)..........(3900000)..........(4000000)..........(4100000)..........(4200000)..........(4300000)..........(4400000)..........(4500000)......
Input sentences: 4562102 Output sentences: 4500966
Traceback (most recent call last):
File "/Users/nchan/programs/git-ws/seq2seq/bin/tools/generate_vocab.py", line 53, in <module>
for line in fileinput.input():
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/fileinput.py", line 248, in __next__
line = self._readline()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/fileinput.py", line 360, in _readline
self._file = open(self._filename, self._mode)
FileNotFoundError: [Errno 2] No such file or directory: '--max_vocab_size'
NCHAN-M-G1HR:data nchan$
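
One plausible reading of the traceback: fileinput.input() with no arguments iterates over sys.argv[1:], so a flag such as --max_vocab_size is treated as a filename unless it is parsed out first. A minimal sketch of that separation follows, assuming the flag takes an integer and the remaining arguments are input files. The function name build_vocab is hypothetical and this is not the actual generate_vocab.py code.

```python
import argparse
import collections
import fileinput


def build_vocab(argv):
    """Counts whitespace-separated tokens in the files named in argv."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--max_vocab_size", type=int, default=None)
    parser.add_argument("files", nargs="*")
    args = parser.parse_args(argv)

    counts = collections.Counter()
    # Passing files= explicitly stops fileinput from reading sys.argv,
    # so flags are never mistaken for filenames. An empty list means stdin.
    with fileinput.input(files=args.files) as lines:
        for line in lines:
            counts.update(line.split())
    # most_common(None) returns every token, most frequent first.
    return counts.most_common(args.max_vocab_size)
```

Called as build_vocab(sys.argv[1:]), this accepts the flag before or after the file names.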

What's the tool for automated code formatting?

I noticed that some commits have messages like automated code formatting. Can anyone tell me which tool Google uses for formatting Python code?

Example configurations

Writing configuration files with hyperparameters from scratch is difficult because there are a lot of options. We should provide a few example hyperparameter configurations for models. For example:

Large NMT: Configuration for large NMT/Summarization models that can get state of the art results
Medium NMT: Medium-size NMT model that does reasonably well and is faster to train than the large model
Small NMT: NMT model that is very fast to train but probably doesn't do very well.

If we decide to add image captioning support we should also add configuration files for that.

Generate Python API Documentation

We should generate proper API documentation based on PyDoc strings. The questions are:

How to make it look nice?
How to integrate it into the documentation?

Should finish #23 before doing this.

Preprocess and prepare MS COCO Dataset

We can probably use the pre-processing script from im2txt. It generates SequenceExamples, which isn't great for text, but perhaps better than writing our own.

Has an attention mechanism been added?

I cannot find any trace of an attention mechanism, and I am curious whether one has been added to this seq2seq framework.

Link to preprocessed data from tutorial is incomplete

In [1], linked from [2], the test sets are not subword encoded. In addition, the merges file from BPE is not included in the archive. Therefore, a model trained on the subword-encoded training set cannot be evaluated without downloading the raw data and training the BPE model from scratch.

[1] https://drive.google.com/open?id=0B_bZck-ksdkpREE3LXJlMXVLUWM
[2] https://google.github.io/seq2seq/nmt/

$ tar -tzvf wmt16_en_de.tar.gz 
-rw-r--r-- dennybritz/eng 279423 2017-03-07 04:31 vocab.bpe.32000
-rw-r--r-- dennybritz/eng 778882202 2017-03-07 04:29 train.tok.clean.bpe.32000.de
-rw-r--r-- dennybritz/eng 673102722 2017-03-07 04:21 train.tok.clean.bpe.32000.en
-rw-r--r-- dennybritz/eng    393971 2017-03-07 02:43 newstest2016.tok.de
-rw-r--r-- dennybritz/eng    354403 2017-03-07 02:44 newstest2016.tok.en
-rw-r--r-- dennybritz/eng    279171 2017-03-07 02:43 newstest2015.tok.de
-rw-r--r-- dennybritz/eng    253703 2017-03-07 02:44 newstest2015.tok.en
-rw-r--r-- dennybritz/eng    410292 2017-03-07 02:43 newstest2014.tok.de
-rw-r--r-- dennybritz/eng    377491 2017-03-07 02:44 newstest2014.tok.en
-rw-r--r-- dennybritz/eng    399549 2017-03-07 02:43 newstest2013.tok.de
-rw-r--r-- dennybritz/eng    349480 2017-03-07 02:44 newstest2013.tok.en

What's missing are the bpe.32000 merges file and newstest*.tok.bpe.32000.* subword-encoded test set files. (Not hard to regenerate, but might as well include them for future folk.)

How to develop beam search with a set of predefined responses?

In the original paper of SmartReply:

First, the elements of R (possible set of responses) are organized into a trie. Then, we conduct a left-to-right beam search, but only retain hypotheses that appear in the trie. This search process has complexity O(bl) for beam size b and maximum response length l. Both b and l are typically in the range of 10-30, so this method dramatically reduces the time to find the top responses and is a critical element of making this system deployable.

Would you please consider this feature and add a detailed task list to this issue for interested contributors?
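
To make the quoted procedure concrete, here is a small sketch of a beam search that only keeps hypotheses found in a trie of allowed responses. It is an illustration under assumptions, not the SmartReply implementation: the END marker, the function names, and the log_prob_fn callback (standing in for a decoder that scores the next token given a prefix) are all hypothetical.

```python
END = "</s>"  # hypothetical end-of-response marker


def build_trie(responses):
    """Builds a nested-dict trie from a list of token lists."""
    root = {}
    for tokens in responses:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
        node[END] = {}  # marks a complete response
    return root


def constrained_beam_search(trie, log_prob_fn, beam_size, max_len):
    """Left-to-right beam search that never leaves the trie.

    log_prob_fn(prefix, token) returns the log-probability of token given
    the prefix tuple. Returns up to beam_size (score, tokens) pairs.
    """
    beam = [(0.0, (), trie)]
    finished = []
    for _ in range(max_len + 1):  # one extra step to emit END
        candidates = []
        for score, prefix, node in beam:
            # Only tokens that continue some allowed response are expanded.
            for tok, child in node.items():
                new_score = score + log_prob_fn(prefix, tok)
                if tok == END:
                    finished.append((new_score, prefix))
                else:
                    candidates.append((new_score, prefix + (tok,), child))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished[:beam_size]
```

Each step expands at most beam_size hypotheses and only along trie edges, which is where the O(bl) bound in the quote comes from.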

Found an error when running the command: python -m unittest seq2seq.test.pipeline_test

I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla M40 24GB, pci bus id: 0000:04:00.0)
INFO:tensorflow:Restored model from /tmp/tmpzF6STH/model.ckpt-50
W tensorflow/core/framework/op_kernel.cc:993] Out of range: Reached limit of 1
[[Node: parallel_read_1/filenames/limit_epochs/CountUpTo = CountUpToT=DT_INT64, _class=["loc:@parallel_read_1/filenames/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/c
pu:0"]]
W tensorflow/core/framework/op_kernel.cc:993] Out of range: FIFOQueue '_35_parallel_read/common_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: parallel_read/common_queue_Dequeue = QueueDequeueV2component_types=[DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]
W tensorflow/core/framework/op_kernel.cc:993] Out of range: FIFOQueue '_35_parallel_read/common_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: parallel_read/common_queue_Dequeue = QueueDequeueV2component_types=[DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]
a a a a 泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣
泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣
a a a a a a a a a 泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣
泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣泣
E

ERROR: test_train_infer (seq2seq.test.pipeline_test.PipelineTest)
Tests training and inference scripts.

Traceback (most recent call last):
File "seq2seq/test/pipeline_test.py", line 184, in test_train_infer
infer_script.main([])
File "~/seq2seq/bin/infer.py", line 125, in main
sess.run([])
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 462, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 786, in run
run_metadata=run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 744, in run
return self._sess.run(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 899, in run
run_metadata=run_metadata))
File "seq2seq/tasks/dump_attention.py", line 126, in after_run
_create_figure(fetches)
File "seq2seq/tasks/dump_attention.py", line 58, in _create_figure
fig = plt.figure(figsize=(8, 8))
File "/usr/lib64/python2.7/site-packages/matplotlib/pyplot.py", line 535, in figure
**kwargs)
File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_tkagg.py", line 81, in new_figure_manager
return new_figure_manager_given_figure(num, figure)
File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_tkagg.py", line 89, in new_figure_manager_given_figure
window = Tk.Tk()
File "/usr/lib64/python2.7/lib-tk/Tkinter.py", line 1745, in __init__
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
TclError: no display name and no $DISPLAY environment variable

Ran 2 tests in 25.226s

FAILED (errors=1)
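
The TclError above comes from the TkAgg backend trying to open a window on a machine without a display. One common workaround, assuming the figures only need to be written to disk, is to select the non-interactive Agg backend before pyplot is imported. The sketch below is not the actual dump_attention.py code, and the function name save_attention_plot is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # must run before matplotlib.pyplot is imported
import matplotlib.pyplot as plt


def save_attention_plot(scores, path):
    """Renders a 2-D score matrix to an image file without needing $DISPLAY."""
    fig = plt.figure(figsize=(8, 8))
    plt.imshow(scores, interpolation="nearest", cmap=plt.cm.Blues)
    fig.savefig(path)
    plt.close(fig)  # release the figure so repeated calls do not leak memory
```

Setting the MPLBACKEND environment variable to Agg before launching Python should have the same effect without a code change.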

ImportError: cannot import name contrib

Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/lib/python2.7/unittest/__main__.py", line 12, in <module>
main(module=None)
File "/usr/lib/python2.7/unittest/main.py", line 94, in __init__
self.parseArgs(argv)
File "/usr/lib/python2.7/unittest/main.py", line 149, in parseArgs
self.createTests()
File "/usr/lib/python2.7/unittest/main.py", line 158, in createTests
self.module)
File "/usr/lib/python2.7/unittest/loader.py", line 130, in loadTestsFromNames
suites = [self.loadTestsFromName(name, module) for name in names]
File "/usr/lib/python2.7/unittest/loader.py", line 91, in loadTestsFromName
module = __import__('.'.join(parts_copy))
File "seq2seq/__init__.py", line 24, in <module>
from seq2seq import contrib
ImportError: cannot import name contrib
When I run "python -m unittest seq2seq.test.pipeline_test", I get this error. Could someone tell me how to solve this problem?

Run on Gigaword Summarization Task

Generate Gigaword data based on https://github.com/facebookarchive/NAMAS
Run experiments using standard train/dev/test splits
Evaluate using ROUGE script

Correcting Un-PEP263 encoding definition

Encoding declarations that do not follow PEP 263 should be corrected. A Python script's encoding must be declared with the magic comment on the first or second line; otherwise it has no effect and is treated as a normal comment.

Maybe it should follow TF's style, where the encoding definition comes before the license/copyright notice, e.g. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tensorboard/lib/python/http_util_test.py

For details, see http://stackoverflow.com/q/42777847/610569
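
The placement rule can be checked with the standard library. The sketch below uses tokenize.detect_encoding from Python 3, which inspects at most the first two lines and falls back to utf-8 when no declaration is found there (Python 2 falls back to ASCII instead). The function name declared_encoding is hypothetical.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Returns the encoding Python 3 would use for this source file."""
    # detect_encoding reads at most two lines, as PEP 263 specifies.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Declaration on line 1: honored (latin-1 is normalized to iso-8859-1).
early = b"# -*- coding: latin-1 -*-\n# Copyright notice\nx = 1\n"
# Declaration on line 3, after the license header: ignored.
late = b"# Copyright notice\n# License text\n# -*- coding: latin-1 -*-\nx = 1\n"

print(declared_encoding(early))
print(declared_encoding(late))
```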

Error while running python -m unittest seq2seq.test.pipeline_test

Hi I'm using Windows 10 and Python 3.5
I got an error message ImportError: cannot import name 'SecondOrStepTimer'
Can I get some help?

This is the entire error message.

ERROR: seq2seq (unittest.loader._FailedTest)

ImportError: Failed to import test module: seq2seq
Traceback (most recent call last):
File "C:\Users\Bumho\Anaconda3\lib\unittest\loader.py", line 153, in loadTestsFromName
module = __import__(module_name)
File "C:\Users\Bumho\seq2seq\seq2seq\__init__.py", line 26, in <module>
from seq2seq import decoders
File "C:\Users\Bumho\seq2seq\seq2seq\decoders\__init__.py", line 17, in <module>
from seq2seq.decoders.rnn_decoder import *
File "C:\Users\Bumho\seq2seq\seq2seq\decoders\rnn_decoder.py", line 32, in <module>
from seq2seq.encoders.rnn_encoder import default_rnn_cell_params
File "C:\Users\Bumho\seq2seq\seq2seq\encoders\__init__.py", line 17, in <module>
import seq2seq.encoders.rnn_encoder
File "C:\Users\Bumho\seq2seq\seq2seq\encoders\rnn_encoder.py", line 27, in <module>
from seq2seq.training import utils as training_utils
File "C:\Users\Bumho\seq2seq\seq2seq\training\__init__.py", line 17, in <module>
from seq2seq.training import hooks
File "C:\Users\Bumho\seq2seq\seq2seq\training\hooks.py", line 28, in <module>
from tensorflow.python.training.basic_session_run_hooks import SecondOrStepTimer # pylint: disable=E0611
ImportError: cannot import name 'SecondOrStepTimer'

How to use Multiple GPUs?

I think seq2seq training is not using multiple GPUs. The tokens/sec metric is the same whether I train on a VM with 1 GPU or with 4 GPUs.

Can someone provide a demo of how to use 4 GPUs on a single machine? All I found in the docs was https://google.github.io/seq2seq/training/#distributed-training . That links to an example of how to use multiple devices using tf.device and how to use a cluster with tf.learn, but I couldn't figure out how to proceed with either approach. Thanks!

Running python -m bin.train as specified in https://google.github.io/seq2seq/nmt/ ...

Four devices are found (from logs):

I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 1 2 3 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y N N N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 1:   N Y N N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 2:   N N Y N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 3:   N N N Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: a370:00:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K80, pci bus id: 9f8e:00:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:2) -> (device: 2, name: Tesla K80, pci bus id: b265:00:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:3) -> (device: 3, name: Tesla K80, pci bus id: 8743:00:00.0)

Memory is allocated to all 4, but only one GPU has non-zero utilization.

$ nvidia-smi 
Tue Mar 14 19:42:15 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 8743:00:00.0     Off |                    0 |
| N/A   50C    P0    74W / 149W |  10363MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 9F8E:00:00.0     Off |                    0 |
| N/A   78C    P0    67W / 149W |  10363MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | A370:00:00.0     Off |                    0 |
| N/A   74C    P0    94W / 149W |  10402MiB / 11439MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | B265:00:00.0     Off |                    0 |
| N/A   62C    P0    64W / 149W |  10363MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
