
Comments (4)

kpe avatar kpe commented on August 14, 2024

@breadbread1984 Your code above looks OK to me (keep in mind the freeze_bert_layers will not freeze the LayerNorm layers, which is needed for adapter-BERT).
How do you call model.predict?

from bert-for-tf2.

breadbread1984 avatar breadbread1984 commented on August 14, 2024

I just tokenize the input string pair and feed it into BERT.
My code is here.
BERT is called here.
The preprocessing code was copied from run_classifier.py in Google's official BERT project.
I really appreciate your help.


kpe avatar kpe commented on August 14, 2024

@breadbread1984 - I think it doesn't work because there is no Model in your _classify() function. Keras needs a Model instance to wire up a TF graph. You have built a model in BERT.py (e.g. by calling model.build()), but the classifier in _classify() (dropout and dense layers belonging to no model) is not wired into the execution graph. To fix this you either need a separate Model that calls the bert Model, or, alternatively, you can return the bert layer from BERT.py and build a single classifier model to be used in _classify() (i.e. calling model.build() in Predictor only). Does this explanation make sense?

To make it work, you could change your code like this:

diff --git a/BERT.py b/BERT.py
index 722d4bb..1a3b7f4 100644
--- a/BERT.py
+++ b/BERT.py
@@ -39,7 +39,7 @@ def BERT(max_seq_len = 128, bert_model_dir = 'models/chinese_L-12_H-768_A-12', d
     output = bert([input_token_ids, input_segment_ids]);
     # create model containing only bert layer
     model = tf.keras.Model(inputs = [input_token_ids, input_segment_ids], outputs = output);
-    model.build(input_shape = [(None, max_seq_len), (None, max_seq_len)]);
+    #model.build(input_shape = [(None, max_seq_len), (None, max_seq_len)]);
     # freeze_bert_layers
     freeze_bert_layers(bert);
     # load bert layer weights
diff --git a/Predictor.py b/Predictor.py
index 2c7e6da..dd1dcfe 100644
--- a/Predictor.py
+++ b/Predictor.py
@@ -151,15 +151,20 @@ class Predictor(object):
     def _classify(self, inputs, mask, training = None):
 
         # the first element of output sequence.
-        outputs = self.bert(inputs, mask, training);
+        outputs = self.bert.predict(inputs)
+
+        cls_input = tf.keras.Input((128,768,), dtype=tf.float32)
         # first_token.shape = (batch, hidden_size)
-        first_token = tf.keras.layers.Lambda(lambda seq: seq[:, 0, :])(outputs);
-        first_token = tf.keras.Dropout(rate = 0.5)(first_token);
+        first_token = tf.keras.layers.Lambda(lambda seq: seq[:, 0, :])(cls_input);
+        first_token = tf.keras.layers.Dropout(rate = 0.5)(first_token);
         pooled_output = tf.keras.layers.Dense(units = first_token.shape[-1], activation = tf.math.tanh)(first_token);
         dropout = tf.keras.layers.Dropout(rate = 0.5)(pooled_output);
         logits = tf.keras.layers.Dense(units = 2, activation = tf.nn.softmax)(dropout);
 
-        return logits;
+        model = tf.keras.models.Model(inputs=cls_input, outputs=logits)
+        model.build(input_shape = (None, 128, 768))
+
+        return model.predict(outputs)
 
     def predict(self, question, answer):
 
@@ -167,6 +172,10 @@ class Predictor(object):
         input_ids = tf.constant(input_ids, dtype = tf.int32);
         input_mask = tf.constant(input_mask, dtype = tf.int32);
         segment_ids = tf.constant(segment_ids, dtype = tf.int32);
+
+        input_ids = tf.expand_dims(input_ids, axis=0)
+        segment_ids = tf.expand_dims(segment_ids, axis=0)
+        
         logits = self._classify([input_ids, segment_ids], input_mask, False);
         probabilities = tf.nn.softmax(logits);
         out = tf.math.argmax(probabilities);
@@ -174,6 +183,7 @@ class Predictor(object):
 
 if __name__ == "__main__":
 
+    tf.enable_eager_execution()
     assert tf.executing_eagerly();
     predictor = Predictor();
     print(predictor.predict('今天天气如何?','感觉很不错!'));

but this is not what you want, as it first predicts on the BERT model and then feeds its output to a second classifier model. A single classifier model containing the bert layer would be better (and easier to train).

Also note that you need the batch dimension when feeding data into the model (hence the call to tf.expand_dims() above).
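For illustration, here is a minimal sketch of the single-model approach suggested above, where the encoder and the classifier head are wired into one tf.keras.Model. The embedding/dense "encoder stub" is a hypothetical stand-in for the BertModelLayer from bert-for-tf2 (which also produces a (batch, seq_len, hidden) sequence output); the vocabulary size and dimensions match the chinese_L-12_H-768_A-12 setup used in the thread:

```python
import tensorflow as tf

MAX_SEQ_LEN = 128
HIDDEN_SIZE = 768
VOCAB_SIZE = 21128  # chinese_L-12_H-768_A-12 vocab size

# Functional inputs: token ids and segment ids, as in BERT.py
input_token_ids = tf.keras.Input(shape=(MAX_SEQ_LEN,), dtype=tf.int32)
input_segment_ids = tf.keras.Input(shape=(MAX_SEQ_LEN,), dtype=tf.int32)

# Stand-in for the real bert layer: in the actual code this whole block
# would be replaced by `sequence_output = bert([input_token_ids,
# input_segment_ids])` using the BertModelLayer returned from BERT.py.
tok_emb = tf.keras.layers.Embedding(VOCAB_SIZE, HIDDEN_SIZE)(input_token_ids)
seg_emb = tf.keras.layers.Embedding(2, HIDDEN_SIZE)(input_segment_ids)
sequence_output = tf.keras.layers.Add()([tok_emb, seg_emb])

# Classifier head wired into the SAME graph as the encoder,
# mirroring the layers from _classify()
first_token = tf.keras.layers.Lambda(lambda seq: seq[:, 0, :])(sequence_output)
first_token = tf.keras.layers.Dropout(rate=0.5)(first_token)
pooled = tf.keras.layers.Dense(HIDDEN_SIZE, activation=tf.math.tanh)(first_token)
dropout = tf.keras.layers.Dropout(rate=0.5)(pooled)
logits = tf.keras.layers.Dense(units=2, activation=tf.nn.softmax)(dropout)

# One model, one predict() call; trainable end to end
model = tf.keras.Model(inputs=[input_token_ids, input_segment_ids],
                       outputs=logits)
```

A single example is then fed with an explicit batch dimension, e.g. `model.predict([tf.expand_dims(input_ids, 0), tf.expand_dims(segment_ids, 0)])`, which returns a (1, 2) array of class probabilities.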


breadbread1984 avatar breadbread1984 commented on August 14, 2024

Got it. Thanks for your informative and kind reply!
