lspvic / copynet
CopyNet Implementation with TensorFlow and nmt
Hi there, the code does not work when I set the parameters beam_width and num_translations_per_input to be greater than 1.
For example, when I set beam_width=9 and batch_size=32, the error is as follows:
InvalidArgumentError (see above for traceback): Incompatible shapes: [288,1] vs. [32,11]
[[Node: dynamic_seq2seq/decoder/decoder/while/BeamSearchDecoderStep/Equal = Equal[T=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](dynamic_seq2seq/decoder/decoder/while/BeamSearchDecoderStep/ExpandDims, dynamic_seq2seq/decoder/decoder/while/BeamSearchDecoderStep/Equal/Enter)]]
I don't have any idea how to fix this problem; any replies will be appreciated!
Thank you!
Hello,
when I set the parameters as follows:
--num_layers=1
--num_units=32
--share_vocab=True
--copynet=True
--gen_vocab_size=500
my GPU has 12206 MiB of memory, yet it returns the error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[100,32,367]
[[Node: dynamic_seq2seq/decoder/decoder/while/BasicDecoderStep/einsum/transpose = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dynamic_seq2seq/decoder/decoder/while/BasicDecoderStep/einsum/transpose/Enter, dynamic_seq2seq/decoder/decoder/while/BasicDecoderStep/einsum_4/transpose/perm)]]
[[Node: dynamic_seq2seq/decoder/decoder/while/BasicDecoderStep/TrainingHelperNextInputs/All/_131 = _HostRecvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_219_dynamic_seq2seq/decoder/decoder/while/BasicDecoderStep/TrainingHelperNextInputs/All", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
I read your README, which says "Just wrapper an any existing rnn cell (BasicLSTMCell, AttentionWrapper and so on)." So I tried to wrap an AttentionWrapper in the following way:
First I used AttentionWrapper to wrap a BasicLSTMCell, which gave me what I'll call attention_cell.
Then I used CopyNetWrapper to wrap attention_cell, and I got the error
TypeError: The two structures don't have the same nested structure.
in CopyNetWrapperState's clone method.
When I remove the AttentionWrapper, it works well.
I have no idea how to deal with this.
I hope anyone who has implemented this can teach me. Thanks!
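The nesting order described above (attention cell inside, copy cell outside, each level's zero state containing the inner level's state) can be sketched with plain Python stubs. These classes only stand in for the TF wrappers to show the composition; they are not the real implementations.

```python
class StubLSTMCell:
    """Stands in for BasicLSTMCell: its state is a plain tuple (c, h)."""
    def zero_state(self, batch_size):
        return (0.0, 0.0)

class StubAttentionWrapper:
    """Stands in for AttentionWrapper: its state *contains* the inner cell's state."""
    def __init__(self, cell):
        self._cell = cell
    def zero_state(self, batch_size):
        return {"cell_state": self._cell.zero_state(batch_size), "attention": 0.0}

class StubCopyNetWrapper:
    """Stands in for CopyNetWrapper: outermost layer, wraps whatever it is given."""
    def __init__(self, cell):
        self._cell = cell
    def zero_state(self, batch_size):
        return {"cell_state": self._cell.zero_state(batch_size), "prob_c": 0.0}

attention_cell = StubAttentionWrapper(StubLSTMCell())  # inner wrap
copy_cell = StubCopyNetWrapper(attention_cell)         # outer wrap
state = copy_cell.zero_state(32)
# The copy state's cell_state must be the *attention* state, not the raw LSTM tuple:
assert "attention" in state["cell_state"]
```

The point of the sketch: each wrapper's state must hold exactly the state type of the cell it wraps, which is what the nested-structure check in `clone` enforces.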
OOV means an out-of-vocabulary word.
I can't find any code that handles this problem; maybe I missed some important steps?
Looking forward to your advice or answers.
Is it correct that the size of prob_c is self._encoder_state_size? I think it must be the maximum sequence length.
Thanks.
I got the following error when training a CopyNet model:
NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Key dynamic_seq2seq/decoder/CopyWeight not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_147 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_154_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
...
I use CopyNetWrapper to wrap a decoder; this is my code:
train_decoder = tf.contrib.seq2seq.AttentionWrapper(
    decoder, attention_mechanism,
    attention_layer_size=self.config.PHVM_decoder_dim)
train_encoder_state = train_decoder.zero_state(
    self.batch_size, dtype=tf.float32).clone(cell_state=sent_dec_state)
copynet_decoder = CopyNetWrapper(
    train_decoder, sent_input, sent_lens, sent_lens, self.tgt_vocab_size)
copy_train_encoder_state = copynet_decoder.zero_state(
    self.batch_size, dtype=tf.float32).clone(cell_state=train_encoder_state)
However, during training I got an error:
Traceback (most recent call last):
File "/home/work/mnt/project/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 297, in assert_same_structure
expand_composites)
TypeError: The two structures don't have the same nested structure.First structure: type=CopyNetWrapperState str=CopyNetWrapperState(cell_state=AttentionWrapperState(cell_state=(<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/checked_cell_state:0' shape=(?, 300) dtype=float32>, <tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/checked_cell_state_1:0' shape=(?, 300) dtype=float32>), attention=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros_2:0' shape=(?, 300) dtype=float32>, time=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros:0' shape=(?, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros_3:0' shape=(?, ?) dtype=float32>), last_ids=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/sub:0' shape=(?,) dtype=int32>, prob_c=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/zeros_1:0' shape=(?, ?) dtype=float32>)
Second structure: type=CopyNetWrapperState str=CopyNetWrapperState(cell_state=(<tf.Tensor 'sentence_level/train/while/sent_deocde/sent_dec_state/dense/BiasAdd:0' shape=(?, 300) dtype=float32>, <tf.Tensor 'sentence_level/train/while/sent_deocde/sent_dec_state/dense_1/BiasAdd:0' shape=(?, 300) dtype=float32>), last_ids=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/sub:0' shape=(?,) dtype=int32>, prob_c=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/zeros_1:0' shape=(?, ?) dtype=float32>)
More specifically: The two namedtuples don't have the same sequence type. First structure type=AttentionWrapperState str=AttentionWrapperState(cell_state=(<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/checked_cell_state:0' shape=(?, 300) dtype=float32>, <tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/checked_cell_state_1:0' shape=(?, 300) dtype=float32>), attention=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros_2:0' shape=(?, 300) dtype=float32>, time=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros:0' shape=(?, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'sentence_level/train/while/sent_deocde/CopyNetWrapperZeroState/AttentionWrapperZeroState/zeros_3:0' shape=(?, ?) dtype=float32>) has type AttentionWrapperState, while second structure type=tuple str=(<tf.Tensor 'sentence_level/train/while/sent_deocde/sent_dec_state/dense/BiasAdd:0' shape=(?, 300) dtype=float32>, <tf.Tensor 'sentence_level/train/while/sent_deocde/sent_dec_state/dense_1/BiasAdd:0' shape=(?, 300) dtype=float32>) has type tuple
I have no idea how to solve this problem. Has anyone met the same error before?
It would be very nice if you could offer me some advice.
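The traceback above boils down to a nest-structure mismatch: the state being compared has a plain tuple where an AttentionWrapperState is expected. This stdlib-only sketch reproduces the kind of check that TF's nest utilities perform (the names and the `same_structure` helper below are illustrative, not the TF API):

```python
from collections import namedtuple

# Minimal stand-in for the real AttentionWrapperState namedtuple.
AttentionWrapperState = namedtuple("AttentionWrapperState",
                                   ["cell_state", "attention"])

def same_structure(a, b):
    # Like tf.nest, treat two tuples as the same structure only if their
    # *types* match: a namedtuple vs. a plain tuple is a mismatch, even
    # when the contents line up.
    if type(a) is not type(b):
        return False
    if isinstance(a, tuple):
        return len(a) == len(b) and all(
            same_structure(x, y) for x, y in zip(a, b))
    return True

expected = AttentionWrapperState(cell_state=(1.0, 2.0), attention=0.0)
given = ((1.0, 2.0), 0.0)  # a raw tuple, e.g. an LSTM state passed directly
print(same_structure(expected, expected))  # True
print(same_structure(expected, given))     # False -> the TypeError above
```

In TF terms, one common cause is cloning with `cell_state=` set to the raw RNN state tuple at a level that expects the full AttentionWrapperState; the usual remedy is to build the AttentionWrapper's own zero state, clone *that* with the RNN state, and pass the resulting AttentionWrapperState upward, so every level keeps its own namedtuple type.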
Also, for
copynet_cell = CopyNetWrapper(cell, encoder_outputs, encoder_input_ids,
                              vocab_size, gen_vocab_size)
would you please explain the parameters in detail?
Looking forward to any reply.
I need some help: when I run main.py, an error occurs in the keras_wrapper.
C:\Users\think>python C:\Users\think\Desktop\nmt-keras-master/main.py
Using TensorFlow backend.
[09/04/2019 11:48:58] <<< Cupy not available. Using numpy. >>>
[09/04/2019 11:48:59] Running training.
[09/04/2019 11:48:59] Building EuTrans_esen dataset
Traceback (most recent call last):
File "C:\Users\think\Desktop\nmt-keras-master/main.py", line 49, in
train_model(parameters, args.dataset)
File "C:\Users\think\Desktop\nmt-keras-master\nmt_keras\training.py", line 64, in train_model
dataset = build_dataset(params)
File "C:\Users\think\Desktop\nmt-keras-master\data_engine\prepare_data.py", line 151, in build_dataset
label_smoothing=params.get('LABEL_SMOOTHING', 0.))
File "c:\users\think\src\keras-wrapper\keras_wrapper\dataset.py", line 1270, in setOutput
bpe_codes=bpe_codes, separator=separator, use_unk_class=use_unk_class)
File "c:\users\think\src\keras-wrapper\keras_wrapper\dataset.py", line 1701, in preprocessTextFeatures
'It currently is: %s' % (str(annotations_list)))
Exception: Wrong type for "annotations_list". It must be a path to a text file with the sentences or a list of sentences. It currently is: examples/EuTrans//training.en
Great job making a CopyNet-enabled NMT.
I noticed (sadly for my current experiment) that when running CopyNet-enabled NMT, I have to have both target and source vocabularies with the same number of words; otherwise I get an error.
Is that a "feature" or a bug? Do I need to send you some context?
Thank you
Thanks a lot for your work on CopyNet.
I don't clearly understand the parameters vocab_size and gen_vocab_size. For example, suppose I have a vocabulary table containing 9,999 words plus a special token "UNK", i.e., the size of the vocabulary table is 10,000. Now I have a source sentence consisting of 10 words, 5 of which are not in the vocabulary table. Does that mean the parameter vocab_size is 10,005 (or 10,010?) and gen_vocab_size is 10,000? If so, when I use the CopyNetWrapper cell, should I compute the maximum length of the input sentences and use it to set vocab_size?
Thanks again
Sam
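The arithmetic in the question above can be made concrete. This is a sketch of two sizing conventions seen in copy-mechanism implementations generally, stated as an assumption rather than a description of this wrapper's actual behavior: the generate vocabulary is the fixed word list, and the extended vocabulary adds temporary ids above gen_vocab_size for copied OOV words.

```python
gen_vocab_size = 10_000   # 9,999 words + the UNK token
max_src_len = 10          # longest source sentence
oov_in_sentence = 5       # source words not in the vocabulary table

# Convention A: reserve one extra id per source position,
# so vocab_size depends on the maximum input length.
vocab_size_by_position = gen_vocab_size + max_src_len   # 10,010

# Convention B: reserve one extra id per distinct OOV word,
# so vocab_size depends only on how many OOVs appear.
vocab_size_by_oov = gen_vocab_size + oov_in_sentence    # 10,005

print(vocab_size_by_position, vocab_size_by_oov)  # 10010 10005
```

Which convention applies here (and hence whether the maximum sentence length matters) is exactly what the wrapper's source would need to settle.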
Hi all,
I would like to know how to adjust the parameters of the CopyNet network to control whether or not source words are copied during decoding. Where is the source code of CopyNet in the official nmt?
Thank you !
Best Regards,
Connie
It works perfectly fine with the greedy decoder. Here is the code (TensorFlow 1.8.0):
encoder_emb_inp = tf.nn.embedding_lookup(embeddings, x)
encoder_cell = rnn.GRUCell(rnn_size, name='encoder')
encoder_outputs, encoder_state = tf.nn.dynamic_rnn(
    encoder_cell, encoder_emb_inp, sequence_length=len_docs, dtype=tf.float32)
tiled_encoder_outputs = tf.contrib.seq2seq.tile_batch(encoder_outputs, multiplier=beam_width)
tiled_sequence_length = tf.contrib.seq2seq.tile_batch(len_docs, multiplier=beam_width)
tiled_encoder_final_state = tf.contrib.seq2seq.tile_batch(encoder_state, multiplier=beam_width)
tiled_t = tf.contrib.seq2seq.tile_batch(t, multiplier=beam_width)
start_tokens = tf.constant(word2int['SOS'], shape=[batch_size])
decoder_cell = rnn.GRUCell(rnn_size, name='decoder')
attention_mechanism = tf.contrib.seq2seq.LuongAttention(
    rnn_size, tiled_encoder_outputs, memory_sequence_length=tiled_sequence_length)
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    decoder_cell, attention_mechanism, attention_layer_size=rnn_size)
initial_state = decoder_cell.zero_state(
    batch_size * beam_width, dtype=tf.float32).clone(cell_state=tiled_encoder_final_state)
decoder_cell = CopyNetWrapper(decoder_cell, tiled_encoder_outputs, tiled_t,
                              len(set(delta).union(words)), vocab_size)
initial_state = decoder_cell.zero_state(
    batch_size * beam_width, dtype=tf.float32).clone(cell_state=initial_state)
decoder = tf.contrib.seq2seq.BeamSearchDecoder(
    cell=decoder_cell, embedding=embeddings, start_tokens=start_tokens,
    end_token=word2int['EOS'], initial_state=initial_state,
    beam_width=beam_width, output_layer=None, length_penalty_weight=0.0)
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder)
The error is:
File "/home/usr/.local/lib/python2.7/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 531, in _split_batch_beams
reshaped_t.set_shape(expected_reshaped_shape)
File "/home/usr/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 538, in set_shape
raise ValueError(str(e))
ValueError: Dimension 2 in both shapes must be equal, but are 38253 and 4. Shapes are [1,1,38253] and [1,1,4].
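The ValueError above comes from BeamSearchDecoder reshaping a cell output from [batch * beam, depth] back to [batch, beam, depth] and checking the depth against what the cell declares. A stdlib-only sketch of that check (the function name and logic are illustrative of `_split_batch_beams`, not the TF source):

```python
def split_batch_beams(flat_shape, batch_size, beam_width, declared_depth):
    """Reshape a flat [batch*beam, depth] shape to [batch, beam, depth],
    failing the way the beam search decoder does on a depth mismatch."""
    batch_beam, depth = flat_shape
    assert batch_beam == batch_size * beam_width
    if depth != declared_depth:
        raise ValueError(
            "Dimension 2 in both shapes must be equal, but are %d and %d"
            % (depth, declared_depth))
    return (batch_size, beam_width, depth)

# Consistent depths reshape fine:
print(split_batch_beams((1, 38253), 1, 1, 38253))  # (1, 1, 38253)
# Mismatched depths reproduce the error: the CopyNet wrapper appears to emit
# the extended vocabulary (38253 here) while another part of the graph still
# declares a different size (4 in the traceback).
try:
    split_batch_beams((1, 38253), 1, 1, 4)
except ValueError as e:
    print(e)
```

Under that reading, the fix would involve making the wrapper's declared output_size agree with the scores it actually emits when running under beam search, though the traceback alone does not confirm which side is wrong.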