
Comments (6)

PratikNalage commented on June 11, 2024
  1. Where is the pre-trained model for Word2Vec?

  2. Error in Word2Vec.py

  File "word2vec.py", line 178, in <module>
    corpus = createCorpus(data)
NameError: name 'data' is not defined


DeepsMoseli commented on June 11, 2024

Hi @sanjayb678, I wrote and ran the whole script in Spyder (Python 3.6). I would advise you to keep the same configuration at first; I have not tested whether the code works exactly the same in a notebook. Saving shouldn't be a problem as far as I know. However, you can skip over this line as long as the model is in memory.
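For reference, a minimal sketch of that idea, assuming the line being skipped is a gensim model.save() call (the filename, the corpus variable, and the hyperparameters are illustrative, and gensim 4.x keyword names are assumed):

from gensim.models import Word2Vec

# train as usual; `corpus` is assumed to be a list of tokenized sentences
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=1)

# the save below can be skipped as long as `model` stays in memory
# model.save("word2vec_summarization.model")

# later steps can keep using the in-memory object directly
some_word = model.wv.index_to_key[0]
vector = model.wv[some_word]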


amanjaswani commented on June 11, 2024

Why have you done

label_encoder,onehot_encoded,onehot=summonehot(data["summaries"])

Shouldn't the function argument be corpus instead of data["summaries"]?


MuruganR96 commented on June 11, 2024

@PratikNalage

In cnn_daily_load.py, you can create a function like this:

def cnn_daily_load():
    """Load the data: articles and their summaries."""
    # initialize the container here so `data` is defined before use
    data = {"articles": [], "summaries": []}
    filenames = load_data(datasets["cnn"], data_categories[0])

    # files alternate: even indices are articles, odd indices are summaries
    for k in range(len(filenames[:400])):
        if k % 2 == 0:
            try:
                data["articles"].append(cleantext(parsetext(datasets["cnn"], data_categories[0], "%s" % filenames[k])))
            except Exception as e:
                data["articles"].append("Could not read")
                print(e)
        else:
            try:
                data["summaries"].append(cleantext(parsetext(datasets["cnn"], data_categories[0], "%s" % filenames[k])))
            except Exception as e:
                data["summaries"].append("Could not read")
                print(e)
    return data

Then simply import it from cnn_daily_load.py into word2vec.py:

from cnn_daily_load import cnn_daily_load, cleantext
data = cnn_daily_load()

Your first question is: where is the pre-trained model for Word2Vec?

I think we are simply using the skip-gram algorithm to generate our own word embeddings, which is why there is no need for a pre-trained Word2Vec model. It is just another way of generating word embeddings.
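A minimal, self-contained sketch of training skip-gram embeddings from scratch with gensim (sg=1 selects skip-gram); the toy corpus and hyperparameters are illustrative assumptions, not the repository's settings:

from gensim.models import Word2Vec

# toy corpus: a list of tokenized sentences (gensim 4.x keyword names assumed)
toy_corpus = [["the", "cat", "sat", "on", "the", "mat"],
              ["the", "dog", "barked", "at", "the", "cat"]]

# sg=1 -> skip-gram; sg=0 would be CBOW
model = Word2Vec(sentences=toy_corpus, vector_size=50, window=5, min_count=1, sg=1, epochs=20)

# the learned embedding for a word in the vocabulary
print(model.wv["cat"])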


DeepsMoseli commented on June 11, 2024


MuruganR96 commented on June 11, 2024

Not really, @DeepsMoseli. Here you are using gensim's skip-gram algorithm (Word2Vec) to build a normal Word2Vec model and then generate embeddings for the words, i.e. training from scratch.

Great stuff. We never used a pre-trained Word2Vec model here.
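As a rough illustration of "generating embeddings for the words" from such a freshly trained model (reusing the hypothetical `model` from the sketches above), the learned vectors can be pulled straight out of memory into an embedding matrix, with no pre-trained file involved:

import numpy as np

# one row per vocabulary word, taken directly from the in-memory model
vocab = model.wv.index_to_key
embedding_matrix = np.zeros((len(vocab), model.wv.vector_size))
for i, word in enumerate(vocab):
    embedding_matrix[i] = model.wv[word]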

@amanjaswani, I did not fully understand your question, but let me give you a hand.

label_encoder,onehot_encoded,onehot=summonehot(data["summaries"])

label_encoder is for the training labels; the Word2Vec embeddings are for the training data.
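A minimal guess at what a summonehot-style helper could look like, assuming it integer-encodes each unique summary and then one-hot encodes the result; this is an illustrative sketch, not the repository's actual implementation:

import numpy as np
from sklearn.preprocessing import LabelEncoder

def summonehot_sketch(summaries):
    # integer-encode each unique summary string...
    label_encoder = LabelEncoder()
    integer_encoded = label_encoder.fit_transform(summaries)
    # ...then one-hot encode the integers (rows of an identity matrix)
    onehot_encoded = np.eye(len(label_encoder.classes_))[integer_encoded]
    return label_encoder, onehot_encoded, integer_encoded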

