
Comments (4)

atreyasha avatar atreyasha commented on August 14, 2024

I am working on a similar sequence-tagging task for argument candidate identification. Essentially, BERT or ALBERT performs the encoding of the raw input; then you need a layer on top of BERT|ALBERT to decode the representations into the desired target.

I would essentially follow this example here: https://github.com/kpe/bert-for-tf2/blob/master/examples/gpu_movie_reviews.ipynb

Under create_model, you would need to modify the layers after the BERT|ALBERT layer to map to your output sequence dimension. I will probably do this task in another repo and can post some results soon.

@kpe you mentioned in #30 to ignore the activations of the padding in the output layer, would you also suggest doing this for a sequence tagging task? If so, how would you propose doing this in the output layer?

Also, thank you for this awesome repo. Minor issue though: under NEWS on the readme, I think the first entry should be 6th Jan 2020. Just a minor thing, no biggie :)

from bert-for-tf2.

harrystuart avatar harrystuart commented on August 14, 2024

Any update on NER tasks with this library?


yingchengsun avatar yingchengsun commented on August 14, 2024

If there were a NER example for this library, that would be very helpful!


ptamas88 avatar ptamas88 commented on August 14, 2024

Hi,
Since I managed to use this library for a NER task, I am happy to share my experience.
Sorry, I can't share the whole code, but I'll try to explain the key parts.

  1. The input text is tokenized by the tokenizer module and padded to a specified max length (in my case 200 tokens at most)
  2. For each token, the output tag is transformed into a one-hot vector; if the tokenizer broke one word into multiple tokens, I used the corresponding tag for the first token and [MASK] for the remaining pieces of the original word
  3. So if I have X sentences in the training set, the input shape is (X, 200), where 200 is the padded length of each sentence. In this case the output shape is (X, 200, NUMBER_OF_TAGS). NUMBER_OF_TAGS is the number of your entity types, which depends on whether you use BIOE or just BIO, plus the special tokens [CLS], [PAD], [MASK]. In my case the tags are:
    ['B-ORG', 'I-ORG', 'B-MISC', 'I-MISC', 'B-LOC', 'I-LOC', 'B-PER', 'I-PER', 'O', '[CLS]', '[MASK]', '[PAD]'].
    This way my shapes are (X, 200) and (X, 200, 12)
  4. Load the BERT model the same way as in the classification example, but use a different architecture for the remaining layers, since this is not just classification. This is basically the example code from the package description with a little tweak:
import tensorflow as tf
import bert as bert_tf2  # pip install bert-for-tf2

# bert_params is created earlier, e.g. from the pretrained checkpoint config
bert_layer = bert_tf2.BertModelLayer.from_params(bert_params, name="bert")

input_ids = tf.keras.layers.Input(shape=(200,), dtype='int32')  # (batch, 200)
output = bert_layer(input_ids)                                  # (batch, 200, hidden)
# one softmax over the 12 tags at every time step
output = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(units=12, activation='softmax'))(output)
model = tf.keras.models.Model(inputs=input_ids, outputs=output)

model.build(input_shape=(None, 200))

bert_layer.apply_adapter_freeze()
bert_layer.embeddings_layer.trainable = False

The magic here is the TimeDistributed wrapper layer.
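For concreteness, steps 1–3 (sub-word tag alignment, padding, and one-hot encoding) can be sketched in plain Python; the helper name and the toy word-piece input below are illustrative only, not part of the library:

```python
# Sketch of steps 1-3: put the real tag on the first word piece of each word,
# [MASK] on the remaining pieces, pad to max_len, then one-hot encode.
# `align_and_pad` is a hypothetical helper, not a library function.
TAGS = ['B-ORG', 'I-ORG', 'B-MISC', 'I-MISC', 'B-LOC', 'I-LOC',
        'B-PER', 'I-PER', 'O', '[CLS]', '[MASK]', '[PAD]']
TAG2ID = {t: i for i, t in enumerate(TAGS)}

def align_and_pad(word_pieces, word_tags, max_len=8):
    # word_pieces: one list of sub-tokens per word; word_tags: one tag per word
    tokens, tags = ['[CLS]'], ['[CLS]']
    for pieces, tag in zip(word_pieces, word_tags):
        tokens.append(pieces[0])
        tags.append(tag)               # real tag on the first piece only
        for piece in pieces[1:]:
            tokens.append(piece)
            tags.append('[MASK]')      # remainder of a broken-up word
    tokens += ['[PAD]'] * (max_len - len(tokens))
    tags += ['[PAD]'] * (max_len - len(tags))
    one_hot = [[int(TAG2ID[t] == j) for j in range(len(TAGS))] for t in tags]
    return tokens, one_hot

tokens, y = align_and_pad([['john'], ['jo', '##han', '##sson']],
                          ['B-PER', 'I-PER'])
# tokens → ['[CLS]', 'john', 'jo', '##han', '##sson', '[PAD]', '[PAD]', '[PAD]']
```

In practice the token strings are then mapped to vocabulary ids with the tokenizer, giving the (X, 200) input and (X, 200, 12) target arrays described above.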
My results:
After just 1 epoch on 29k training sentences:
loss: 0.0227 - categorical_accuracy: 0.9933 - val_loss: 0.0042 - val_categorical_accuracy: 0.9988
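One caveat: categorical_accuracy over padded batches also counts the many trivially correct [PAD] positions, so the real per-token tagging accuracy is somewhat lower than the reported number. A minimal sketch of a padding-masked accuracy in plain NumPy, assuming pad_id is the index of '[PAD]' in the 12-tag list above:

```python
import numpy as np

# Sketch: accuracy computed only over non-[PAD] positions. pad_id=11 assumes
# '[PAD]' is the last entry of the 12-tag list; adjust to your tag order.
def masked_accuracy(y_true, y_pred, pad_id=11):
    true_ids = y_true.argmax(-1)      # (batch, seq_len) tag indices
    pred_ids = y_pred.argmax(-1)
    mask = true_ids != pad_id         # ignore padding positions
    return float((true_ids[mask] == pred_ids[mask]).mean())

# toy example: 1 sentence, 4 positions, 12 tags; two positions are padding
true = np.eye(12)[[8, 0, 11, 11]]     # O, B-ORG, PAD, PAD
pred = np.eye(12)[[8, 1, 11, 0]]      # second real token predicted wrong
acc = masked_accuracy(true[None], pred[None])  # → 0.5
```

During training, a similar effect can be achieved in Keras by passing a per-token sample_weight in model.fit that is zero at padding positions.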

So basically, that's it folks :)

