I am attempting to get a pre-trained BERT layer working in TF 2.0. Said differently, I


Problems loading the readme about bert-for-tf2 (closed)

kpe commented on August 14, 2024


Comments (2)

kpe commented on August 14, 2024

@ralphbrooks - thank you for noting that. I think the example in the README was somewhat short. The model variables in Keras seem to be created only after the model has been built, i.e. Keras needs to know some of the input dimensions to properly calculate the dimensions of the weights. So, a more complete example might look like this (note the call to Model.build):

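        import os

        import tensorflow as tf
        from tensorflow import keras

        # Import paths assumed from the bert-for-tf2 package layout; adjust to your installed version.
        from bert import BertModelLayer
        from bert.loader import StockBertConfig, load_stock_weights

        # model_dir is expected to point at the unpacked pre-trained BERT checkpoint directory.
        print(model_dir)

        bert_config_file = os.path.join(model_dir, "bert_config.json")
        bert_ckpt_file   = os.path.join(model_dir, "bert_model.ckpt")

        # Read the stock BERT config and convert it to layer parameters.
        with tf.io.gfile.GFile(bert_config_file, "r") as reader:
            stock_params = StockBertConfig.from_json_string(reader.read())
            bert_params  = stock_params.to_bert_model_layer_params()

        l_bert = BertModelLayer.from_params(bert_params, name="bert")

        max_seq_len = 128
        l_input_ids      = keras.layers.Input(shape=(max_seq_len,), dtype='int32', name="input_ids")
        l_token_type_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32', name="token_type_ids")
        output = l_bert([l_input_ids, l_token_type_ids])

        model = keras.Model(inputs=[l_input_ids, l_token_type_ids], outputs=output)
        # Build first so the layer variables exist before the checkpoint is loaded.
        model.build(input_shape=[(None, max_seq_len),
                                 (None, max_seq_len)])

        # Load the pre-trained weights only after the model has been built.
        load_stock_weights(l_bert, bert_ckpt_file)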
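
To illustrate why the build step has to come first, here is a small framework-free sketch of deferred weight creation. The `LazyDense` class and its `weight_shapes` helper are hypothetical names made up for this illustration; they are not part of Keras or bert-for-tf2 and only mimic the general idea that weight shapes depend on the input shape.

```python
# Toy illustration only: LazyDense is a hypothetical stand-in, not a Keras class.
class LazyDense:
    def __init__(self, units):
        self.units = units
        self.weights = []   # stays empty until build() learns the input shape
        self.built = False

    def build(self, input_shape):
        # The kernel shape depends on the last input dimension,
        # which is not known when the layer object is constructed.
        in_dim = input_shape[-1]
        kernel = [[0.0] * self.units for _ in range(in_dim)]
        bias = [0.0] * self.units
        self.weights = [kernel, bias]
        self.built = True

    def weight_shapes(self):
        # Report the shapes of whatever weights exist so far.
        if not self.built:
            return []
        kernel, bias = self.weights
        return [(len(kernel), len(kernel[0])), (len(bias),)]


layer = LazyDense(units=4)
shapes_before = layer.weight_shapes()   # nothing to load weights into yet

layer.build((None, 128))                # batch size unknown, sequence length 128
shapes_after = layer.weight_shapes()    # kernel and bias now exist

print(shapes_before, shapes_after)
```

In this sketch, a checkpoint loader that ran before `build` would find no variables to fill, which is the same situation the short README example ran into.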