aylien / docnade
TensorFlow implementation of DocNADE
Hi.
First of all, I have read your articles and found them to be well written. I faced some issues when training the model. I split my dataset of 900 articles into the three CSV files needed, then ran the preprocessing Python file followed by the training file. This is where I ran into problems. The training process seems to be stuck in an infinite loop: the number of iterations is 10000+ and counting, and the output every 50 iterations is 0 and 0. I would like to ask for your advice: could there be anything wrong with my implementation?
Hello!
I have found a problem in preprocess.py. As you can see, the id of the first word in the .vocab file will be equal to 0.
def main(args):
    ...
    with open(args.vocab, 'r') as f:
        vocab = [w.strip() for w in f.readlines()]
    vocab_to_id = dict(zip(vocab, range(len(vocab))))
Now let's take a look at the preprocess method:
def preprocess(text, vocab_to_id):
    ids = [vocab_to_id.get(x) for x in tokens(text) if vocab_to_id.get(x)]
It looks like the condition in the list comprehension was meant to check that vocab_to_id.get(x) is not None. Unfortunately, the integer 0 is also falsy in Python, so every occurrence of the word with id=0 is silently dropped.
Example:
example.vocab
a
b
input/training.csv
mark, a a a a b b b b
Produced output/training
0,1 1
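One possible fix is to compare against None explicitly. The sketch below is only an illustration, not the repository's code: the `tokens` function here is a hypothetical whitespace splitter standing in for the real tokenizer, and `preprocess_buggy` / `preprocess_fixed` are names introduced just for the comparison.

```python
def tokens(text):
    # Hypothetical stand-in for the repository's tokenizer (assumption):
    # split on whitespace only.
    return text.split()


def preprocess_buggy(text, vocab_to_id):
    # Same filter as in the report: a truthiness check drops id 0.
    return [vocab_to_id.get(x) for x in tokens(text) if vocab_to_id.get(x)]


def preprocess_fixed(text, vocab_to_id):
    # Look each token up once and keep every id that is not None,
    # so the word with id 0 survives.
    ids = (vocab_to_id.get(x) for x in tokens(text))
    return [i for i in ids if i is not None]


vocab = ['a', 'b']
vocab_to_id = dict(zip(vocab, range(len(vocab))))

# 'c' is out of vocabulary and should be dropped by both versions.
buggy = preprocess_buggy('a a b b c', vocab_to_id)
fixed = preprocess_fixed('a a b b c', vocab_to_id)
```

With this input, the buggy version keeps only the ids for `b`, while the fixed version keeps the ids for both `a` and `b` and still drops the unknown word `c`.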
Thank you!
Hey, I'm just having a look through the code, and I was wondering why in preprocess.py documents are transformed into their log counts and then serialized?
The DocNADE model is also generative. Can you include the generative model?
Hello, best_val = 0, but the validation value of every epoch is also 0, so the model is never saved. Could you please tell me why that is?
Can this code be extended to the language model described in the paper? How do we initialize U and the weight matrix W (lm) in that case?