Comments (4)

matt-peters commented on July 2, 2024

The code samples don't include any additional pretrained word embeddings such as GloVe.

In our paper, we used both GloVe and ELMo and concatenated them end-to-end. The ELMo representations have dimension 1024 and GloVe has dimension 50, 100, 200, or 300, so it isn't possible to combine them element-wise without a projection layer somewhere. Instead of introducing additional parameters for the projection, we just concatenated end-to-end. When using the 300-dimensional GloVe vectors this gave a total embedding dimension of 1324 (= 1024 + 300).
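As a rough sketch of what that looks like in TensorFlow (the shapes come from the numbers above; the placeholder names are just for illustration, not the paper's code):

import tensorflow as tf

# Hypothetical inputs: precomputed GloVe vectors (300-d) and ELMo
# representations (1024-d) for a batch of token sequences.
glove_embeddings = tf.placeholder(tf.float32, [None, None, 300])
elmo_embeddings = tf.placeholder(tf.float32, [None, None, 1024])

# End-to-end concatenation along the feature axis: no projection layer,
# total dimension 1024 + 300 = 1324.
combined = tf.concat([elmo_embeddings, glove_embeddings], axis=2)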

khashei commented on July 2, 2024

Thanks for asking this question; I had exactly the same one. I am pretty sure that the example given in the TensorFlow implementation doesn't follow the paper's recommendation to concatenate the input (X) to the embedding.

khashei commented on July 2, 2024

Looking more into the code, the context-independent embedding is already accessible as a separate operation in the model, like this:
embeddings_op = BidirectionalLanguageModel(options_file, weight_file)(character_ids_placeholder)
context_independent_embedding = embeddings_op["token_embeddings"]

It is already projected to match the LSTM dimension. It includes tokens for the bos and eos that have to be dropped, like this:

tf.concat([embeddings_op["token_embeddings"][:, 1:-1, :], weight_layers('input', embeddings_op, l2_coef)["weighted_op"]], axis=2)
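Putting the pieces together, a minimal sketch might look like the following (the file paths are placeholders, and the character ids are assumed to come from bilm's Batcher, which adds the bos/eos tokens that get sliced off below):

import tensorflow as tf
from bilm import BidirectionalLanguageModel, weight_layers

# Placeholder paths to a pretrained bilm-tf model.
options_file = 'elmo_options.json'
weight_file = 'elmo_weights.hdf5'

# Character ids of shape [batch, num_tokens + 2, 50]; the +2 is the
# bos/eos tokens added by the Batcher.
character_ids_placeholder = tf.placeholder(tf.int32, [None, None, 50])

bilm = BidirectionalLanguageModel(options_file, weight_file)
embeddings_op = bilm(character_ids_placeholder)

# Context-independent (char-CNN) embeddings, with bos/eos dropped so the
# shapes line up with the weighted ELMo op, which (as in the snippet
# above) does not include them.
token_embeddings = embeddings_op['token_embeddings'][:, 1:-1, :]
elmo_input = weight_layers('input', embeddings_op, l2_coef=0.0)['weighted_op']

combined = tf.concat([token_embeddings, elmo_input], axis=2)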

choym0098 commented on July 2, 2024

@khashei So we literally just replace the existing word embedding (like w2v or GloVe) with the ELMo embedding (= weight_layers( ... )["weighted_op"])?

(Added)
In fact, I tried all three options on a sentiment analysis task (with 5 different emotions): element-wise concatenation, end-to-end concatenation, and simple replacement with the ELMo embedding. Only the replacement with the ELMo embedding outperformed the existing embedding (GloVe), and only on the training set (not on the dev set, though). So I think you may be right!
However, my data was quite small (300 examples for training, 60 for testing), so if someone can clarify this, it would be great! A sketch of the three options follows.
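For concreteness, the three options could be sketched like this (an illustrative reading, not the exact code from the experiment above; in particular, "element-wise" is taken here to mean summing after projecting GloVe up to ELMo's 1024 dimensions, since the sizes differ otherwise):

import tensorflow as tf

# Illustrative inputs.
glove = tf.placeholder(tf.float32, [None, None, 300])  # GloVe embeddings
elmo = tf.placeholder(tf.float32, [None, None, 1024])  # weighted ELMo op

# Option 1: element-wise combination -- needs a projection so the
# dimensions match, which introduces extra trainable parameters.
glove_projected = tf.layers.dense(glove, 1024, use_bias=False)
elementwise = elmo + glove_projected                   # [batch, time, 1024]

# Option 2: end-to-end concatenation (what the paper used): no projection.
concatenated = tf.concat([elmo, glove], axis=2)        # [batch, time, 1324]

# Option 3: simple replacement -- feed ELMo alone into the task model.
replacement = elmo                                     # [batch, time, 1024]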
