Comments (4)
I am working on a similar sequence-tagging task for argument candidate identification. Essentially, BERT or ALBERT performs the encoding of the raw input; you then need a layer on top of the BERT/ALBERT encoder to decode the representations into the desired target sequence.
I would essentially follow this example here: https://github.com/kpe/bert-for-tf2/blob/master/examples/gpu_movie_reviews.ipynb
Under `create_model`, you would need to modify the layers after the BERT/ALBERT layer to map to your output sequence dimension. I will probably do this task in another repo and can post some results soon.
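A minimal sketch of that idea, using a plain Embedding layer as a stand-in for the BERT/ALBERT encoder (the sizes and names here are illustrative assumptions, not this repo's API):

```python
import tensorflow as tf

MAX_LEN, VOCAB, HIDDEN, NUM_TAGS = 128, 30522, 768, 9

# Stand-in encoder: anything mapping (batch, seq_len) token ids
# to (batch, seq_len, hidden) contextual vectors, e.g. a BERT layer.
encoder = tf.keras.layers.Embedding(VOCAB, HIDDEN)

token_ids = tf.keras.layers.Input(shape=(MAX_LEN,), dtype=tf.int32)
hidden_states = encoder(token_ids)  # (batch, MAX_LEN, HIDDEN)

# Token-level head: an independent softmax over the tag set at every position.
tag_probs = tf.keras.layers.Dense(NUM_TAGS, activation="softmax")(hidden_states)

model = tf.keras.Model(inputs=token_ids, outputs=tag_probs)
```

The only change from a sentence-classification model is that the head keeps the time axis instead of pooling it away.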
@kpe, you mentioned in #30 ignoring the activations of the padding in the output layer; would you also suggest doing this for a sequence-tagging task? If so, how would you propose doing it in the output layer?
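(For anyone reading along: one common way to ignore padding in a per-token loss, sketched below under the assumption that padding uses token id 0, is to give padded positions zero weight; in Keras such weights can typically be passed per token via `sample_weight`. This is a general technique, not necessarily what #30 intended.)

```python
import numpy as np

PAD_ID = 0  # assumed id of the [PAD] token

def padding_sample_weights(token_ids):
    """Weight 1.0 for real tokens and 0.0 for padding, so padded
    positions contribute nothing to a per-token loss."""
    return (np.asarray(token_ids) != PAD_ID).astype(np.float32)

# A batch of two sentences padded to length 5.
batch = [[101, 2023, 2003, 102, 0],
         [101, 2459, 102, 0, 0]]
weights = padding_sample_weights(batch)
```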
Also, thank you for this awesome repo. Minor issue though: under NEWS in the README, I think the first entry should be 6th Jan 2020. Just a minor thing, no biggie :)
from bert-for-tf2.
Any update on NER tasks with this library?
If there were a NER example with this library, that would be very helpful!
Hi,
As I managed to use this library for a NER task, I am happy to share my experience.
Sorry, I can't share the whole code, but I will try to explain the key parts.
- The input text is tokenized by the tokenizer module and padded to a specified maximum length (in my case 200 tokens at most).
- For each token, the output tag is transformed into a one-hot vector; if the tokenizer broke one word into multiple tokens, I used the word's tag for the first token and [MASK] for the remaining pieces of the original word.
- So if I have X sentences in the training set, the input shape is (X, 200), where 200 is the padded length of each sentence. The output shape is then (X, 200, NUMBER_OF_TAGS). NUMBER_OF_TAGS is the number of your entity types (it depends on whether you use BIOE or just BIO) plus the special tokens [CLS], [PAD] and [MASK]. In my case the tags are:
['B-ORG', 'I-ORG', 'B-MISC', 'I-MISC', 'B-LOC', 'I-LOC', 'B-PER', 'I-PER', 'O', '[CLS]', '[MASK]', '[PAD]'].
This way my shapes are (X, 200) and (X, 200, 12). Load the BERT model the same way as in the classification example, but use a different architecture for the remaining layers, since this is not just sentence classification. This is basically the example code from the package description with a little tweak:
import bert
import tensorflow as tf

bert_layer = bert.BertModelLayer.from_params(bert_params, name="bert")
input_ids = tf.keras.layers.Input(shape=(200,), dtype=tf.int32)  # token ids, padded to 200
output = bert_layer(input_ids)  # (batch, 200, hidden_size)
output = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(units=12, activation='softmax'))(output)  # (batch, 200, 12)
model = tf.keras.models.Model(inputs=input_ids, outputs=output)
model.build(input_shape=(None, 200))
bert_layer.apply_adapter_freeze()
bert_layer.embeddings_layer.trainable = False
The magic here is the TimeDistributed wrapper layer.
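What TimeDistributed(Dense(...)) computes can be illustrated with plain numpy (illustrative, smaller sizes): the same dense weights are applied independently at every timestep, which is equivalent to one batched matmul over the last axis. (A plain Dense on a 3D input does the same thing in Keras, so the wrapper is mostly for clarity here.)

```python
import numpy as np

batch, seq_len, hidden, num_tags = 2, 8, 16, 12
rng = np.random.default_rng(0)

x = rng.normal(size=(batch, seq_len, hidden))  # stand-in encoder output
W = rng.normal(size=(hidden, num_tags))        # dense weights, shared across timesteps
b = np.zeros(num_tags)

# TimeDistributed(Dense): one matmul per timestep, same W and b everywhere.
per_step = np.stack([x[:, t] @ W + b for t in range(seq_len)], axis=1)

# Equivalent batched form over the last axis.
batched = x @ W + b
```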
My results:
After just 1 epoch on 29k training sentences:
loss: 0.0227 - categorical_accuracy: 0.9933 - val_loss: 0.0042 - val_categorical_accuracy: 0.9988
So basically, that's it folks :)
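The subword tag-alignment step described above (the first subtoken keeps the word's tag, the remaining subtokens get [MASK]) can be sketched like this; `align_tags` and `subword_lengths` are hypothetical names for illustration, not part of this library:

```python
def align_tags(word_tags, subword_lengths):
    """Expand word-level tags to token-level tags: the first subtoken
    of each word keeps the tag, the rest get [MASK]."""
    token_tags = []
    for tag, n_pieces in zip(word_tags, subword_lengths):
        token_tags.append(tag)
        token_tags.extend(['[MASK]'] * (n_pieces - 1))
    return token_tags

# E.g. a location name split into 2 pieces, two following words kept whole.
tags = align_tags(['B-LOC', 'O', 'O'], [2, 1, 1])
```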