Comments (4)

matt-peters commented on July 2, 2024

The code samples don't include any additional pretrained word embeddings such as GloVe.

In our paper, we used both GloVe and ELMo and concatenated them end-to-end. The ELMo representations have dimension 1024 and GloVe has dimension 50, 100, 200, or 300, so it isn't possible to combine them element-wise without a projection layer somewhere. Instead of introducing additional parameters for the projection, we just concatenated end-to-end. When using the 300-dimensional GloVe vectors this gave a total embedding dimension of 1324 (= 1024 + 300).
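As a rough sketch of what that looks like in TensorFlow (the shapes come from the numbers above; the placeholder names are just for illustration, not the paper's code):

import tensorflow as tf

# Hypothetical inputs: precomputed GloVe vectors (300-d) and ELMo
# representations (1024-d) for a batch of token sequences.
glove_embeddings = tf.placeholder(tf.float32, [None, None, 300])
elmo_embeddings = tf.placeholder(tf.float32, [None, None, 1024])

# End-to-end concatenation along the feature axis: no projection layer,
# total dimension 1024 + 300 = 1324.
combined = tf.concat([elmo_embeddings, glove_embeddings], axis=2)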

khashei commented on July 2, 2024

Thanks for asking this question; I had exactly the same one. I am pretty sure that the example given in the TensorFlow implementation doesn't follow the paper's recommendation to concatenate the input (X) to the embedding.

khashei commented on July 2, 2024

Looking more into the code, the context-independent embedding is already accessible as a separate operation in the model, like this:
embeddings_op = BidirectionalLanguageModel(options_file, weight_file)(character_ids_placeholder)
context_independent_embedding = embeddings_op["token_embeddings"]

It is already projected to match the LSTM dimension. It includes tokens for the bos and eos that have to be dropped, like this:

tf.concat([embeddings_op["token_embeddings"][:, 1:-1, :], weight_layers('input', embeddings_op, l2_coef)["weighted_op"]], axis=2)
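Putting the pieces together, a minimal sketch might look like the following (the file paths are placeholders, and the character ids are assumed to come from bilm's Batcher, which adds the bos/eos tokens that get sliced off below):

import tensorflow as tf
from bilm import BidirectionalLanguageModel, weight_layers

# Placeholder paths to a pretrained bilm-tf model.
options_file = 'elmo_options.json'
weight_file = 'elmo_weights.hdf5'

# Character ids of shape [batch, num_tokens + 2, 50]; the +2 is the
# bos/eos tokens added by the Batcher.
character_ids_placeholder = tf.placeholder(tf.int32, [None, None, 50])

bilm = BidirectionalLanguageModel(options_file, weight_file)
embeddings_op = bilm(character_ids_placeholder)

# Context-independent (char-CNN) embeddings, with bos/eos dropped so the
# shapes line up with the weighted ELMo op, which (as in the snippet
# above) does not include them.
token_embeddings = embeddings_op['token_embeddings'][:, 1:-1, :]
elmo_input = weight_layers('input', embeddings_op, l2_coef=0.0)['weighted_op']

combined = tf.concat([token_embeddings, elmo_input], axis=2)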

choym0098 commented on July 2, 2024

@khashei So we literally just replace the existing word embedding (like w2v or GloVe) with the ELMo embedding (= weight_layers( ... )["weighted_op"])?

(Added)
In fact, I tried all three options on a sentiment analysis task (with 5 different emotions): element-wise concatenation, end-to-end concatenation, and simple replacement with the ELMo embedding. Only the replacement with the ELMo embedding outperformed the existing embedding (GloVe), and only on the training set (not on the dev set, though). So I think you may be right!
However, my data was quite small (300 examples for training, 60 for testing), so if someone can clarify this, it would be great! A sketch of the three options follows.
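For concreteness, the three options could be sketched like this (an illustrative reading, not the exact code from the experiment above; in particular, "element-wise" is taken here to mean summing after projecting GloVe up to ELMo's 1024 dimensions, since the sizes differ otherwise):

import tensorflow as tf

# Illustrative inputs.
glove = tf.placeholder(tf.float32, [None, None, 300])  # GloVe embeddings
elmo = tf.placeholder(tf.float32, [None, None, 1024])  # weighted ELMo op

# Option 1: element-wise combination -- needs a projection so the
# dimensions match, which introduces extra trainable parameters.
glove_projected = tf.layers.dense(glove, 1024, use_bias=False)
elementwise = elmo + glove_projected                   # [batch, time, 1024]

# Option 2: end-to-end concatenation (what the paper used): no projection.
concatenated = tf.concat([elmo, glove], axis=2)        # [batch, time, 1324]

# Option 3: simple replacement -- feed ELMo alone into the task model.
replacement = elmo                                     # [batch, time, 1024]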
