Comments (6)
-
Where is the pre-trained model for Word2Vec?
-
Error in Word2Vec.py
File "word2vec.py", line 178, in <module>
corpus = createCorpus(data)
NameError: name 'data' is not defined
from bidirectiona-lstm-for-text-summarization-.
Hi @sanjayb678, I wrote and ran the whole script in Spyder (Python 3.6). I would advise you to keep the same configuration at first; I have not tested whether the code works exactly the same in a notebook. Saving shouldn't be a problem as far as I know. However, you can skip over this line as long as the model is in memory.
Why have you done
label_encoder,onehot_encoded,onehot=summonehot(data["summaries"])
Shouldn't the function argument be corpus instead of data["summaries"]?
In cnn_daily_load.py you can also create a function like this (note that `data` must be initialised inside the function, which also fixes the NameError above):

def cnn_daily_load():
    """----------load the data, sentences and summaries-----------"""
    data = {"articles": [], "summaries": []}
    filenames = load_data(datasets["cnn"], data_categories[0])
    for k in range(len(filenames[:400])):
        if k % 2 == 0:
            try:
                data["articles"].append(cleantext(parsetext(datasets["cnn"], data_categories[0], "%s" % filenames[k])))
            except Exception as e:
                data["articles"].append("Could not read")
                print(e)
        else:
            try:
                data["summaries"].append(cleantext(parsetext(datasets["cnn"], data_categories[0], "%s" % filenames[k])))
            except Exception as e:
                data["summaries"].append("Could not read")
                print(e)
    return data
Then simply import it into word2vec.py:
from cnn_daily_load import cnn_daily_load, cleantext
data = cnn_daily_load()
Your first question was: where is the pre-trained model for Word2Vec?
I think we are simply using the skip-gram algorithm to generate our own word embeddings. That is why we have no need for a pre-trained Word2Vec model; training from scratch is just another way of generating word embeddings.
Not really, @DeepsMoseli. In this place you are using gensim's skip-gram algorithm (Word2Vec) to build a normal Word2Vec model and then generate embeddings for the words, training from scratch.
Great stuff. We never used a pre-trained Word2Vec model here.
@amanjaswani I did not fully understand your question, but to give you a hand:
label_encoder,onehot_encoded,onehot=summonehot(data["summaries"])
The label encoder is for the training labels; the Word2Vec embeddings are for the training data.
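The body of summonehot isn't shown in this thread, but the idea it implements can be sketched with plain NumPy; the `summaries` list below is a made-up stand-in for data["summaries"]:

```python
import numpy as np

# Hypothetical toy labels standing in for data["summaries"].
summaries = ["summary a", "summary b", "summary a"]

# Integer-encode each unique label (what a label encoder does)...
labels = sorted(set(summaries))
index = {lab: i for i, lab in enumerate(labels)}
integer_encoded = np.array([index[s] for s in summaries])  # [0, 1, 0]

# ...then turn each integer into a one-hot row vector.
onehot_encoded = np.eye(len(labels))[integer_encoded]
print(onehot_encoded)
# [[1. 0.]
#  [0. 1.]
#  [1. 0.]]
```

So the labels get one-hot vectors, while the article/summary text itself is represented by the Word2Vec embeddings.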