Comments (8)
I think the vocabulary issue has kind of been "solved" by using subword units / word pieces, for example.
You typically only need 16k-32k word pieces to cover almost all of the vocabulary.
As for the original question: it's an interesting feature, but it doesn't seem very common to me, as it needs a set of predefined responses, which you have for very few tasks. It seems very specific to response retrieval (not generation). I think a large refactoring of the beam search may be necessary to support this.
Agree with @amirj, this is an interesting feature, and we might add more features to improve decoding speed.
I think trie-based beam search will not improve speed much for general-purpose decoding. It is mostly used for selecting responses from a candidate set, rather than generating a sequence by choosing one word from the full vocabulary at each step. This ensures response quality; the performance improvement is relative to scoring every candidate and then sorting to select the highest-probability responses.
So it is used in specific scenarios, and it is hard to implement with in-graph beam search (maybe we would need to write a C++ op to generate the candidate words at each step?). An out-of-graph approach like the one im2txt uses might be simpler.
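For illustration, a minimal sketch of the trie idea in plain Python (hypothetical helper names, not code from this repo): the trie is built over the tokenized candidate responses, and each beam step only scores the children of the current node instead of the full vocabulary.

class TrieNode:
    def __init__(self):
        self.children = {}   # token id -> TrieNode
        self.is_end = False  # marks the end of a complete candidate response

def build_trie(candidate_token_lists):
    # Build a trie over the tokenized candidate responses.
    root = TrieNode()
    for tokens in candidate_token_lists:
        node = root
        for token in tokens:
            node = node.children.setdefault(token, TrieNode())
        node.is_end = True
    return root

# At each beam step only the children of the current trie node are valid
# continuations, so we score len(node.children) words, not the full vocabulary.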
For improving decoding / beam search speed, vocabulary size is the key factor, since something like the code below is costly for a large vocabulary, especially the xw_plus_b:
# Projects the decoder output onto the full vocabulary: one dot product
# per vocabulary word, so the cost grows linearly with vocabulary size.
logits = tf.nn.xw_plus_b(output, self.w, self.v)
logprobs = tf.nn.log_softmax(logits)
You have to compute the output for every word in the vocabulary to get the final probabilities.
For training, TensorFlow provides sampled_softmax_loss, which improves performance a lot.
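A minimal training-side usage sketch (TF1-style, hypothetical variable names and shapes): only the true class plus a sample of negatives is scored per step, instead of the full vocabulary.

import tensorflow as tf

vocab_size, hidden_size, batch_size = 50000, 512, 32
softmax_w = tf.get_variable("softmax_w", [vocab_size, hidden_size])
softmax_b = tf.get_variable("softmax_b", [vocab_size])
decoder_output = tf.placeholder(tf.float32, [batch_size, hidden_size])
target_ids = tf.placeholder(tf.int64, [batch_size, 1])

# Scores the true class plus only num_sampled sampled negatives,
# instead of all vocab_size classes as a full softmax would.
loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=softmax_w, biases=softmax_b,
    labels=target_ids, inputs=decoder_output,
    num_sampled=512, num_classes=vocab_size))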
But for decoding, TensorFlow seems to lack something like the approach in "On Using Very Large Target Vocabulary for Neural Machine Translation": you have to compute scores for every word in the vocabulary.
Maybe with a trie-based method you can consider only a small vocabulary at each step, but doing that you cannot get the exact probability at each step.
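A sketch of that restriction (reusing the hypothetical variables from the training sketch above): gather only the output-projection rows for the candidate ids allowed by the trie, then normalize over the candidates. The resulting log-probabilities are normalized over the candidate set only, so they are not the exact full-vocabulary values.

# cand_ids: token ids allowed by the trie at this step.
cand_ids = tf.placeholder(tf.int32, [None])
w_sub = tf.gather(softmax_w, cand_ids)  # [num_candidates, hidden_size]
b_sub = tf.gather(softmax_b, cand_ids)  # [num_candidates]
logits_sub = tf.matmul(decoder_output, w_sub, transpose_b=True) + b_sub
# Normalized over the candidates only, hence only approximate.
logprobs_sub = tf.nn.log_softmax(logits_sub)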
Another approach is self-normalization, which avoids the per-word cost computation over the whole vocabulary at decode time, but you need to change the training cost function:
http://sebastianruder.com/word-embeddings-softmax/index.html#selfnormalisation
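A minimal sketch of a self-normalized training loss along the lines of that post (an assumption, not code from this repo): a penalty pushes the log partition function toward zero, so at decode time the raw logits can be read as approximate log-probabilities without computing the full softmax.

logits = tf.matmul(decoder_output, softmax_w, transpose_b=True) + softmax_b
log_z = tf.reduce_logsumexp(logits, axis=-1)  # log partition function, [batch_size]
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=tf.squeeze(target_ids, axis=1), logits=logits)
alpha = 0.1  # hypothetical penalty weight
loss = tf.reduce_mean(cross_entropy + alpha * tf.square(log_z))
# After training, log_z is approximately 0, so logits can be used directly
# as log-probabilities at decode time, skipping the softmax normalizer.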
@dennybritz Thank you.
> So it is used in specific scenarios, and it is hard to implement with in-graph beam search (maybe we would need to write a C++ op to generate the candidate words at each step?). An out-of-graph approach like the one im2txt uses might be simpler.
@chenghuige Would you please elaborate on how to develop it out of graph?
Did im2txt use an out-of-graph approach?
@amirj In im2txt/inference_utils/caption_generator.py, im2txt does out-of-graph beam search, running each step via sess.run():
softmax, new_states, metadata = self.model.inference_step(sess,
                                                          input_feed,
                                                          state_feed)
But I think transferring the softmax (a large tensor) out of the graph at every step might make inference really slow. Anyway, this way you fully control the process and can do beam search with a trie.
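A minimal sketch of such an out-of-graph loop (hypothetical names; inference_step is modeled on im2txt's interface, and node/children come from the trie sketch earlier in the thread):

import numpy as np
from collections import namedtuple

Beam = namedtuple("Beam", ["tokens", "state", "score", "node"])

def trie_beam_step(sess, model, beams):
    # One decode step driven from Python: feed the last tokens and states,
    # get the full softmax back out of the graph (the slow part noted above).
    input_feed = np.array([b.tokens[-1] for b in beams])
    state_feed = np.array([b.state for b in beams])
    softmax, new_states, _ = model.inference_step(sess, input_feed, state_feed)
    next_beams = []
    for i, b in enumerate(beams):
        # Only extend along children of the current trie node.
        for token, child in b.node.children.items():
            next_beams.append(Beam(tokens=b.tokens + [token],
                                   state=new_states[i],
                                   score=b.score + np.log(softmax[i][token]),
                                   node=child))
    next_beams.sort(key=lambda b: b.score, reverse=True)
    return next_beams[:len(beams)]  # keep the beam width fixed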
I have a similar feature request (tensorflow/tensorflow#11602). At first glance it might seem very specific, but I think it's very useful for determining scores for various output sequences.
Have you solved this problem? I met the same problem and have no idea how to put a trie data structure into TensorFlow.
Have you solved this problem? I also met the same problem and have no idea how to put a trie data structure into TensorFlow.