Hi there, on the BERT for Patents "saved model", how can we access t

BERT for Patents: unable to access hidden layers (closed issue in patents-public-data)

google commented on May 3, 2024

from patents-public-data.

Comments (6)

benzitohhh commented on May 3, 2024

Ah we managed to access the hidden layers by explicitly loading the checkpoint - thanks!

amirbar commented on May 3, 2024

@benzitohhh could you please provide an example for this?

benzitohhh commented on May 3, 2024

@amirbar

Here's an example using bert-for-tf2 (https://github.com/kpe/bert-for-tf2)

It loads the checkpoint (https://storage.googleapis.com/patents-public-data-github/checkpoint.zip).

Set out_layer_ndxs to select whichever layers you need. Below we access the penultimate layer (-2).

import bert
from tensorflow import keras
import tensorflow as tf

MODEL_DIR = "./bert_model"  # the downloaded checkpoint
MAX_SEQ_LEN = 256

l_input_ids = keras.layers.Input(shape=(MAX_SEQ_LEN,), dtype='int32')

bert_params = bert.params_from_pretrained_ckpt(MODEL_DIR)
bert_params.out_layer_ndxs = [-2]

l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")
output = l_bert(l_input_ids)
model = keras.Model(inputs=l_input_ids, outputs=output)
model.build(input_shape=(None, MAX_SEQ_LEN))

# A simple input
texts     = ["<CLS>", "The", "quick", "brown", "fox", "<SEP>"]
input_ids = [2,       1661,  3913,    2494,    4084,  3]
padding = [0] * (MAX_SEQ_LEN - len(input_ids))
input_ids_padded = input_ids + padding

result = model.predict([input_ids_padded])
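
The manual padding above can be wrapped in a small helper. This is a minimal sketch and not part of bert-for-tf2: `pad_ids` and `PAD_ID` are hypothetical names, and it assumes 0 is the padding id, as in the snippet above.

```python
MAX_SEQ_LEN = 256
PAD_ID = 0  # assumption: 0 is the padding id, as in the snippet above


def pad_ids(input_ids, max_len=MAX_SEQ_LEN, pad_id=PAD_ID):
    """Truncate or right-pad a sequence of token ids to exactly max_len."""
    ids = list(input_ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Token ids from the example above.
padded = pad_ids([2, 1661, 3913, 2494, 4084, 3])
batch = [padded]  # a batch containing one sequence of length MAX_SEQ_LEN
```

Note that plain truncation would drop a trailing separator token on over-long inputs, so that case may need separate handling.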

avivihadar commented on May 3, 2024

Hi @benzitohhh, thanks for providing the example. However, it seems to be missing the line that actually loads the weights, which should be called after model.build:
bert.load_bert_weights(l_bert, model_ckpt)

Also, the CLS token weights do not seem to be defined in the target model. Did you attempt to redefine them in TF2?

benzitohhh commented on May 3, 2024

@avivihadar Ah thanks so much. You're totally right, I forgot to actually load the weights. Oooops :)

Maybe I'm misunderstanding, but the CLS token is just the first element in each sequence. So the weights should be there.
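
To make that concrete: assuming the model returns one vector per token, with shape (batch, seq_len, hidden), the CLS vector is simply position 0 of each sequence. A sketch on toy nested lists, where `cls_vectors` is a hypothetical helper (with a NumPy array the equivalent would be `hidden_states[:, 0, :]`):

```python
def cls_vectors(hidden_states):
    """Return the first-token vector of every sequence in the batch."""
    return [sequence[0] for sequence in hidden_states]


# Toy output: batch of 2 sequences, 3 tokens each, hidden size 2.
hidden_states = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]],
]
print(cls_vectors(hidden_states))  # [[0.1, 0.2], [1.0, 1.1]]
```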

We have actually mainly been using bert-as-service (although it uses TensorFlow 1):
https://github.com/hanxiao/bert-as-service

benzitohhh commented on May 3, 2024

@avivihadar Oh wait, by CLS token, you mean the logits? Yeah we're not using that, we're just interested in getting a patent embedding (vectorisation).
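
One common way to turn per-token vectors into a single embedding is to average them over the non-padding positions. The thread does not say which pooling was actually used, so treat this as an illustrative sketch only; `mean_pool` is a hypothetical helper and it assumes the padding id is 0.

```python
def mean_pool(token_vectors, input_ids, pad_id=0):
    """Average the token vectors at positions whose id is not pad_id."""
    kept = [vec for vec, tok in zip(token_vectors, input_ids) if tok != pad_id]
    if not kept:
        raise ValueError("sequence contains only padding")
    hidden_size = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(hidden_size)]


# Toy example: 4 positions, hidden size 2, last position is padding.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
input_ids = [2, 1661, 3, 0]
embedding = mean_pool(token_vectors, input_ids)
print(embedding)  # [3.0, 4.0]
```

The padding position is excluded, so its vector does not skew the average.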
