bojone / bert4keras
keras implement of transformers for humans
Home Page: https://kexue.fm/archives/6915
License: Apache License 2.0
Looking at the bert_keras code, the final layer is just a fully connected (Dense) layer. Do I need to add dropout or L2 regularization? Should it go only on the final Dense layer, or can regularization also be added to the underlying BERT layers during fine-tuning?
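For reference, a common pattern (a sketch with tf.keras, not code from this repo) is to regularize only the task head — dropout plus a light L2 penalty on the final Dense — and leave the pretrained layers unregularized, relying on the small fine-tuning learning rate instead:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Stand-in for the [CLS] vector produced by the BERT model.
cls_vec = keras.Input(shape=(768,))
x = layers.Dropout(0.1)(cls_vec)                      # dropout on the head only
out = layers.Dense(1, activation='sigmoid',
                   kernel_regularizer=regularizers.l2(1e-4))(x)
model = keras.Model(cls_vec, out)
```

In practice you would feed the Lambda-sliced CLS output into this head instead of a fresh Input.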
When running the following code, bert4keras raises an error while keras_bert works fine. The code:
print('build bert model...')
bert_model = load_pretrained_model(config_path, checkpoint_path)
# replacing it with the following two lines works:
# from keras_bert import load_trained_model_from_checkpoint
# bert_model = load_trained_model_from_checkpoint(config_path, checkpoint_path)
x1_input = Input(shape=(maxlen,), dtype='int32')
x2_input = Input(shape=(maxlen,), dtype='int32')
bert_output_layer = bert_model([x1_input, x2_input])
cls_output = Lambda(lambda x: x[:, 0])(bert_output_layer)
output = Dense(1, activation='sigmoid')(cls_output)
model = Model([x1_input, x2_input], output)
model.summary()
adam = Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
model.fit([word_index, sent_index], y_train, shuffle=True, batch_size=128, epochs=epochs, validation_split=0.1)
The error:
InvalidArgumentError: You must feed a value for placeholder tensor 'Input-Token_3' with dtype float and shape [?,?]
[[{{node Input-Token_3}} = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
What could be causing this?
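A frequent cause of this kind of "unfed placeholder" error is wrapping the loaded model in freshly created Input layers, which leaves the model's own input placeholders (here 'Input-Token_3') orphaned. A hedged sketch of the usual fix, shown with a stand-in two-input model in place of the real pretrained one:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the pretrained two-input model (token ids, segment ids).
tok_in = keras.Input(shape=(None,), dtype='int32', name='Input-Token')
seg_in = keras.Input(shape=(None,), dtype='int32', name='Input-Segment')
e1 = layers.Embedding(100, 8)(tok_in)
e2 = layers.Embedding(2, 8)(seg_in)
bert_model = keras.Model([tok_in, seg_in], layers.Add()([e1, e2]))

# Reuse bert_model's own inputs instead of new Input layers,
# so no orphan placeholders are left in the graph.
cls_output = layers.Lambda(lambda x: x[:, 0])(bert_model.output)
output = layers.Dense(1, activation='sigmoid')(cls_output)
model = keras.Model(bert_model.inputs, output)
```

With the real model, that means building the head on bert_model.output and passing bert_model.inputs to Model, rather than calling bert_model on new Input tensors.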
The encode method in SimpleTokenizer:
def encode(self, first, second=None, first_length=None):
    """Return the token ids and segment ids for the text.
    If first_length is given, the first sentence is force-padded
    to that length.
    """
    token_ids, segment_ids = [], []
    token_ids.extend([self._token_dict[c] for c in self.tokenize(first)])
    segment_ids.extend([0] * (len(first) + 2))
    if first_length is not None and len(token_ids) < first_length + 2:
        token_ids.extend([0] * (first_length + 2 - len(token_ids)))
        segment_ids.extend([0] * (first_length + 2 - len(segment_ids)))
    if second is not None:
        token_ids.extend([
            self._token_dict[c]
            for c in self.tokenize(second, add_cls=False)
        ])
        segment_ids.extend([1] * (len(second) + 1))
    return token_ids, segment_ids
I checked the original BERT fine-tuning code: if first + second exceeds the maximum length, the two are trimmed until they are roughly equal in length. The code above only handles the length of first, not second, so if different batches contain sequences of different lengths it raises an error. After changing it so that first + second == 512 - 3, it runs successfully.
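For reference, the trimming loop in the original BERT fine-tuning code can be sketched like this in pure Python (the function name here is mine):

```python
def truncate_pair(first_ids, second_ids, max_length):
    """Pop tokens off the end of the longer sequence until the pair fits,
    mirroring BERT's _truncate_seq_pair: under heavy trimming the two
    sequences end up roughly equal in length."""
    while len(first_ids) + len(second_ids) > max_length:
        if len(first_ids) > len(second_ids):
            first_ids.pop()
        else:
            second_ids.pop()
    return first_ids, second_ids
```

Called before adding [CLS]/[SEP], with max_length reduced by the number of special tokens, this keeps every batch within the model's maximum sequence length.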
Could you explain the relationship between learning_rate, weight_decay, and lr_schedule?
Suppose:
num_train_steps = 100
num_warmup_steps = 10
learning_rate = 0.01
weight_decay_rate = 0.05
lr_schedule = {
    num_warmup_steps: 0.99,
    num_train_steps: 0.01,
}
optimizer = extend_with_weight_decay(Adam)
optimizer = extend_with_piecewise_linear_lr(optimizer)
optimizer_params = {
    'learning_rate': learning_rate,
    'lr_schedule': lr_schedule,
    'weight_decay_rate': weight_decay_rate,
}
optimizer = optimizer(**optimizer_params)
My understanding is that we train 100 steps in total:
steps 0-10: the LR grows linearly from 0 to 0.01*0.99;
steps 10-100: it decays linearly from 0.01*0.99 to 0.01*0.01.
Where does weight_decay_rate come in?
Thanks!
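For what it's worth, one reading of the schedule semantics, as a sketch (check bert4keras's piecewise_linear for the authoritative behavior): lr_schedule maps steps to multipliers of learning_rate, interpolated linearly starting from 0 at step 0 and held flat after the last point, so the effective LR at step t is learning_rate * multiplier(t). weight_decay_rate does not enter the schedule at all — with extend_with_weight_decay, each update shrinks the weights directly (decoupled, AdamW-style decay) rather than adding an L2 term to the gradients.

```python
def piecewise_linear(step, schedule):
    """Rate multiplier at `step` for a {step: multiplier} schedule:
    linear interpolation between points, starting from 0 at step 0,
    flat after the last point (a sketch of the assumed semantics)."""
    points = sorted(schedule.items())
    last_step, last_rate = 0, 0.0
    for s, r in points:
        if step < s:
            return last_rate + (r - last_rate) * (step - last_step) / (s - last_step)
        last_step, last_rate = s, r
    return points[-1][1]
```

Under this reading, the example gives multiplier 0.495 at step 5 (mid-warmup), 0.99 at step 10, and 0.5 at step 55 (mid-decay).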
model = tf.keras.models.load_model('./my_model.h5')
ValueError Traceback (most recent call last)
in ()
----> 1 model_1 = tf.keras.models.load_model('./my_model.h5')
2
3 tf.saved_model.simple_save(
4 tf.keras.backend.get_session(),
5 "./h5_savedmodel/",
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/saving/save.py in load_model(filepath, custom_objects, compile)
144 if (h5py is not None and (
145 isinstance(filepath, h5py.File) or h5py.is_hdf5(filepath))):
--> 146 return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
147
148 if isinstance(filepath, six.string_types):
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
166 model_config = json.loads(model_config.decode('utf-8'))
167 model = model_config_lib.model_from_config(model_config,
--> 168 custom_objects=custom_objects)
169
170 # set weights
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/saving/model_config.py in model_from_config(config, custom_objects)
53       'Sequential.from_config(config)?')
54 from tensorflow.python.keras.layers import deserialize # pylint: disable=g-import-not-at-top
---> 55 return deserialize(config, custom_objects=custom_objects)
56
57
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/layers/serialization.py in deserialize(config, custom_objects)
100 module_objects=globs,
101 custom_objects=custom_objects,
--> 102 printable_module_name='layer')
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
189 custom_objects=dict(
190 list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 191 list(custom_objects.items())))
192 with CustomObjectScope(custom_objects):
193 return cls.from_config(cls_config)
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/engine/network.py in from_config(cls, config, custom_objects)
904 """
905 input_tensors, output_tensors, created_layers = reconstruct_from_config(
--> 906 config, custom_objects)
907 model = cls(inputs=input_tensors, outputs=output_tensors,
908 name=config.get('name'))
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/engine/network.py in reconstruct_from_config(config, custom_objects, created_layers)
1840 # First, we create all layers and enqueue nodes to be processed
1841 for layer_data in config['layers']:
-> 1842 process_layer(layer_data)
1843 # Then we process nodes in order of layer depth.
1844 # Nodes that cannot yet be processed (if the inbound node
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/engine/network.py in process_layer(layer_data)
1822 from tensorflow.python.keras.layers import deserialize as deserialize_layer # pylint: disable=g-import-not-at-top
1823
-> 1824 layer = deserialize_layer(layer_data, custom_objects=custom_objects)
1825 created_layers[layer_name] = layer
1826
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/layers/serialization.py in deserialize(config, custom_objects)
100 module_objects=globs,
101 custom_objects=custom_objects,
--> 102 printable_module_name='layer')
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
178 config = identifier
179 (cls, cls_config) = class_and_config_for_serialized_keras_object(
--> 180 config, module_objects, custom_objects, printable_module_name)
181
182 if hasattr(cls, 'from_config'):
~/nm-local-dir/usercache/137602/appcache/application_1565649576840_6632543/container_e2144_1565649576840_6632543_01_000005/nbenv/nbenv/lib/python3.5/site-packages/tensorflow_core/python/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
163 cls = module_objects.get(class_name)
164 if cls is None:
--> 165 raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
166 return (cls, config['config'])
167
ValueError: Unknown layer: FactorizedEmbedding
from bert4keras.bert import load_pretrained_model, set_gelu
File "/home/env/anaconda3/envs/tensorflow/lib/python3.6/site-packages/bert4keras/bert.py", line 4, in <module>
from .layers import *
File "/home/env/anaconda3/envs/tensorflow/lib/python3.6/site-packages/bert4keras/layers.py", line 321
raise Exception, 'Embedding layer not found'
Hello — running on a single GPU works fine, but using multiple GPUs raises an error. What could be the reason?
The code:
from keras.utils import multi_gpu_model
os.environ["CUDA_VISIBLE_DEVICES"] = "6,7"
with tf.device('/cpu:0'):
    model = Model(albert_model.input, output)
model.summary()
model = multi_gpu_model(model, gpus=2)
The error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Incompatible shapes: [256,10,10] vs. [512,1,10] [[{{node replica_0/model_2/Encoder-1-MultiHeadSelfAttention/sub_1}}]] (1) Invalid argument: Incompatible shapes: [256,10,10] vs. [512,1,10] [[{{node replica_0/model_2/Encoder-1-MultiHeadSelfAttention/sub_1}}]] [[training/Adam/gradients/replica_1/model_2/Embedding-Norm/Mean_1_grad/Shape_2/_436]]
Best regards.
bert4keras/pretraining/roberta/pretraining.py
Lines 83 to 84 in 369f7d4
I'd like to ask: is there a reference showing that doing it this way is more stable than the original BERT, or is it case by case and one just has to try? Also, in the attention matrix, why subtract a large positive number at the padding positions? That makes the padding part of the attention matrix very negative. Does mode=0 give poor results?
bert4keras/bert4keras/layers.py
Lines 106 to 107 in a993fc6
Thank you, 苏神!
As per the title.
Hi — the initialization was changed to build_bert_model, and the old API is no longer compatible... Adding some version management would be nice.
There is a problem with Tokenizer: add_cls and add_sep are not defined.
Traceback (most recent call last):
File "albert.py", line 156, in <module>
callbacks=[evaluator])
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/keras/engine/training.py", line 1732, in fit_generator
initial_epoch=initial_epoch)
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/keras/engine/training_generator.py", line 185, in fit_generator
generator_output = next(output_generator)
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/keras/utils/data_utils.py", line 742, in get
six.reraise(*sys.exc_info())
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/keras/utils/data_utils.py", line 711, in get
inputs = future.get(timeout=30)
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/keras/utils/data_utils.py", line 650, in next_sample
return six.next(_SHARED_SEQUENCES[uid])
File "albert.py", line 93, in forfit
for d in self.iter(True):
File "albert.py", line 81, in iter
token_ids, segment_ids = tokenizer.encode(text1, text2, max_length=MAX_LEN)
File "/home/songuser/anaconda3/envs/keras/lib/python3.7/site-packages/bert4keras/tokenizer.py", line 83, in encode
delta_1 = int(add_cls) + int(add_sep)
def encode(self,
           first_text,
           second_text=None,
           max_length=None,
           first_length=None,
           second_length=None):
    """Return the token ids and segment ids for the text.
    If first_length is given, the first sentence is force-padded to that
    length; likewise, second_length force-pads the second sentence.
    """
    first_tokens = self.tokenize(first_text, add_cls=False, add_sep=False)
    delta_1 = int(add_cls) + int(add_sep)  # add_cls and add_sep are not defined
    delta_2 = int(add_cls) + int(add_sep) * 2
    if second_text is None:
        if max_length is not None:
            first_tokens = first_tokens[:max_length - delta_1]
Saving only the weights works fine, and converting to pb also succeeds — all OK.
But I'm still curious: after saving the whole model with model.save, loading with load_model (with custom_objects supplied and get_config added to the layers) still reports a shape mismatch. Is saving the full model simply not feasible? Is it related to the input shape being None?
ValueError: You called set_weights(weights) on layer "Encoder-1-MultiHeadSelfAttention" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[ 0.03122838, 0.04661432, 0.00716374, .....
I tried both pretrained models from https://github.com/brightmart/albert_zh:
albert_base_zh (trained on an extra 150 million instances, i.e. 36k steps * batch_size 4096);
albert_base_zh (small trial version).
Both give similar errors — does the author know why?
In a Python 2.7 environment, running task_seq2seq.py errors out when saving the model.
Traceback (most recent call last):
File "/media/brx/2d79a6a5-f419-aa4c-b391-314a73033208/project/Word_vector/bert4keras/examples/task_seq2seq.py", line 210, in <module>
callbacks=[evaluator]
File "/home/brx/.local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/brx/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1732, in fit_generator
initial_epoch=initial_epoch)
File "/home/brx/.local/lib/python2.7/site-packages/keras/engine/training_generator.py", line 260, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "/home/brx/.local/lib/python2.7/site-packages/keras/callbacks/callbacks.py", line 152, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/media/brx/2d79a6a5-f419-aa4c-b391-314a73033208/project/Word_vector/bert4keras/examples/task_seq2seq.py", line 197, in on_epoch_end
model.save_weights('./best_model.weights')
File "/home/brx/.local/lib/python2.7/site-packages/keras/engine/saving.py", line 449, in save_wrapper
save_function(obj, filepath, overwrite, *args, **kwargs)
File "/home/brx/.local/lib/python2.7/site-packages/keras/engine/network.py", line 1184, in save_weights
saving.save_weights_to_hdf5_group(f, self.layers)
File "/home/brx/.local/lib/python2.7/site-packages/keras/engine/saving.py", line 761, in save_weights_to_hdf5_group
dtype=val.dtype)
File "/home/brx/.local/lib/python2.7/site-packages/h5py/_hl/group.py", line 139, in create_dataset
self[name] = dset
File "/home/brx/.local/lib/python2.7/site-packages/h5py/_hl/group.py", line 373, in __setitem__
h5o.link(obj.id, self.id, name, lcpl=lcpl, lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 202, in h5py.h5o.link
RuntimeError: Unable to create link (name already exists)
tensorflow.python.framework.errors_impl.NotFoundError: Key bert/embeddings/word_embeddings_2 not found in checkpoint
Thank you so much, 苏神, for turning BERT into such an accessible Keras version!
I have 3 basic questions:
1. In pretraining.py,
bert4keras/pretraining/roberta/pretraining.py
Lines 70 to 87 in 06a9410
mlm_loss and mlm_acc are placed as the last layer inside train_model, rather than passed to compile as the loss, as is usual. Placed inside the model, the current mlm_loss and mlm_acc each return a single value with shape = (). Could that be a problem? It feels like the shape should be (None,) — should an axis=1 be added to the sums?
loss = K.sum(loss * is_masked, axis=1) / (K.sum(is_masked, axis=1) + K.epsilon())
2. During the optimization step in pretraining, are the loss and the acc optimized simultaneously?
bert4keras/pretraining/roberta/pretraining.py
Lines 111 to 114 in 06a9410
3. In data_utils.py,
bert4keras/pretraining/roberta/data_utils.py
Lines 189 to 192 in 06a9410
should 'mlm_acc': K.zeros([1]) be changed to 'mlm_acc': K.ones([1]), since accuracy maxes out at 1?
Thanks again!
In the vocab.txt used by task_sentiment_albert.py, markers such as [PAD], [UNK], [CLS], [SEP], [unused1], [unused2] already exist — why does task_sentiment_albert.py still add them manually at lines 42~52?
And what is the purpose of the keep_words introduced there?
(keras_bert doesn't seem to have this parameter.)
In a Python 3 environment, when using load_vocab with codecs to process vocab.txt, the characters at lines 13504 and 344 (ids 13503 and 343) both get turned into whitespace, but Python's built-in open doesn't have this problem. I noticed this small issue because when I re-saved the vocabulary, one line was missing.
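The difference likely comes from codecs treating unicode line separators (e.g. U+2028) as newlines, whereas the built-in open splits only on \n and \r. A defensive loader that splits on '\n' alone (a sketch; the function name is mine):

```python
def load_vocab_safe(path):
    """Load a vocab file with exactly one token per '\n'-separated line,
    so tokens that happen to be unicode line separators (U+2028 etc.)
    are not silently split into extra empty lines."""
    with open(path, encoding='utf-8', newline='\n') as f:
        lines = f.read().split('\n')
    if lines and lines[-1] == '':
        lines.pop()  # drop only the trailing newline's empty entry
    return {token: i for i, token in enumerate(lines)}

# Demo: a vocab containing a line-separator character as a token.
import os, tempfile
path = os.path.join(tempfile.mkdtemp(), 'vocab.txt')
with open(path, 'w', encoding='utf-8', newline='\n') as f:
    f.write('[PAD]\n\u2028\nfoo\n')
vocab = load_vocab_safe(path)  # 3 tokens, none lost
```

With codecs.open and readlines, the U+2028 line would be split and the round-tripped file would come up one line short — which matches the symptom described.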
Hello — in examples/task_sentiment_albert.py, using
model.save_weights
model.load_weights
works fine, but using
model.save("test.hdf5", overwrite=True, include_optimizer=True)
test_model = load_model("test.hdf5")
fails inside deserialize_keras_object with
ValueError: Unknown optimizer: new_optimizer
Changing the loss, or passing include_optimizer=False, works around it. I'm not sure whether this is something worth improving.
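The extended optimizer classes produced by extend_with_* are custom classes, so load_model needs them registered via custom_objects (or the optimizer skipped with include_optimizer=False / compile=False). A minimal sketch of the custom_objects mechanism, using a stand-in custom layer rather than the real optimizer:

```python
import os
import tempfile
import numpy as np
from tensorflow import keras

class Scale(keras.layers.Layer):
    """Stand-in for any custom class Keras cannot deserialize by name."""
    def call(self, inputs):
        return inputs * 2.0

inp = keras.Input(shape=(3,))
model = keras.Model(inp, Scale()(inp))

path = os.path.join(tempfile.mkdtemp(), 'model.h5')
model.save(path)  # saving succeeds either way

# keras.models.load_model(path) alone would raise
# ValueError: Unknown layer: Scale; registering the class restores it:
restored = keras.models.load_model(path, custom_objects={'Scale': Scale})
```

For the optimizer case, the same idea would be load_model("test.hdf5", custom_objects={'new_optimizer': <the extended class>}) — the key must match the serialized class name.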
token_embedding = Embedding(input_dim=vocab_size,
                            output_dim=hidden_size,
                            mask_zero=True,
                            name='Embedding-Token')

x = MultiHeadAttention(heads=num_attention_heads,
                       head_size=attention_head_size,
                       name=attention_name)([x, x, x, a_mask])

def call(self, inputs, mask=None):
    q, k, v, a_mask = inputs
    v_mask = mask[2]
    q_mask = mask[0]

This would make it convenient to stack layers with native mask support after BERT.
Hello — which open-source license does this repo intend to follow? Could you consider adding a License file?
I'm trying to use Google's pre-trained ALBERT weights for an English sentiment analysis task. I'm sure the dataset files are in the right format, and the data is loaded and tokenized correctly.
However, during training, the train and val accuracies do not change, staying stuck at 50%.
Have I missed some detail needed to load Google's pretrained weights for English data?
Following the instructions, I've downloaded and unzipped the model from https://tfhub.dev/google/albert_base/2?tf-hub-format=compressed, have created the config file.
The downloaded model has 2 pb files, one asset folder with 30k-clean.model and 30k-clean.vocab, and one variables folder with variables.index and variables.data-00000-of-00001.
My code looks like this:
config_path = 'models/albert_base/albert_config.json'
checkpoint_path = 'models/albert_base/variables/variables'
spm_path = 'models/albert_base/assets/30k-clean.model'
tokenizer = SpTokenizer(spm_path)
albert = build_bert_model(config_path, checkpoint_path, with_pool=True,albert=True,return_keras_model=False)
When loading the model, the log shows:
==> searching: bert/embeddings/word_embeddings, found name: bert/embeddings/word_embeddings
==> searching: bert/embeddings/position_embeddings, found name: bert/embeddings/position_embeddings
==> searching: bert/embeddings/token_type_embeddings, found name: bert/embeddings/token_type_embeddings
==> searching: bert/embeddings/LayerNorm/gamma, found name: bert/embeddings/LayerNorm/gamma
==> searching: bert/embeddings/LayerNorm/beta, found name: bert/embeddings/LayerNorm/beta
==> searching: bert/encoder/embedding_hidden_mapping_in/kernel, found name: bert/encoder/embedding_hidden_mapping_in/kernel
==> searching: bert/encoder/embedding_hidden_mapping_in/bias, found name: bert/encoder/embedding_hidden_mapping_in/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/self/query/kernel, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/self/query/bias, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/self/key/kernel, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/self/key/bias, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/self/value/kernel, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/self/value/bias, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/output/dense/kernel, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/output/dense/bias, found name: bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/output/LayerNorm/gamma, found name: bert/encoder/transformer/group_0/inner_group_0/LayerNorm/gamma
==> searching: bert/encoder/transformer/group_0/inner_group_0/attention/output/LayerNorm/beta, found name: bert/encoder/transformer/group_0/inner_group_0/LayerNorm/beta
==> searching: bert/encoder/transformer/group_0/inner_group_0/intermediate/dense/kernel, found name: bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel
==> searching: bert/encoder/transformer/group_0/inner_group_0/intermediate/dense/bias, found name: bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/output/dense/kernel, found name: bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel
==> searching: bert/encoder/transformer/group_0/inner_group_0/output/dense/bias, found name: bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias
==> searching: bert/encoder/transformer/group_0/inner_group_0/output/LayerNorm/gamma, found name: bert/encoder/transformer/group_0/inner_group_0/LayerNorm_1/gamma
==> searching: bert/encoder/transformer/group_0/inner_group_0/output/LayerNorm/beta, found name: bert/encoder/transformer/group_0/inner_group_0/LayerNorm_1/beta
==> searching: bert/pooler/dense/kernel, found name: bert/pooler/dense/kernel
==> searching: bert/pooler/dense/bias, found name: bert/pooler/dense/bias
Does this mean the model loaded correctly? If so, why do the model and the training process not work at all? Thanks.
Does this mean tf 2.0 is already supported? I'd like to double-check.
Could you point out what needs to be changed? Many thanks~
bert4keras 0.2.4
Kears 2.3.1
Model code:
albert_model = build_bert_model(config_path, checkpoint_path, albert=True)
out = Lambda(lambda x: x[:, 0])(albert_model.output)
output = Dense(units=class_num, activation='softmax')(out)
model = Model(albert_model.input, output)
model.compile(loss=model_loss, optimizer=optimizer, metrics=["categorical_accuracy"])
The model is saved with Keras's model.save and loaded with load_model:
from bert4keras.layers import *
model = load_model(os.path.join(model_dir, "albert.m"))
Error:
ValueError: Unknown layer: PositionEmbedding
If instead, after
from bert4keras.layers import custom_object
I load with
model = load_model(os.path.join(model_dir, "albert.m"), custom_object=custom_object)
it fails with:
AttributeError: 'tuple' object has no attribute 'layer'
Hi — when building the seq2seq BERT I get:
TypeError: Expected float32 passed to parameter 'y' of op 'Equal', got 'history_only' of type 'str' instead. Error: Expected float32, got 'history_only' of type 'str' instead.
Environment: python3.6, tensorflow2.0.
The type error comes from this code in layer.py:
if a_mask is not None:
    if a_mask == 'history_only':
        ones = K.ones_like(a[:1])
        a_mask = (ones - tf.linalg.band_part(ones, -1, 0)) * 1e12
        a = a - a_mask
    else:
        a = a - (1 - a_mask) * 1e12
Presumably a_mask is a tensor and cannot be compared against a string, so I changed
if a_mask == 'history_only': to
if isinstance(a_mask, str):
and the model now loads successfully.
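For reference, the 'history_only' branch builds a causal (lower-triangular) additive bias; the same thing expressed in numpy terms:

```python
import numpy as np

def history_only_bias(seq_len, big=1e12):
    """Additive attention bias: 0 where a position may attend (itself and
    the past), -1e12 (effectively -inf after softmax) for future positions.
    Mirrors subtracting (ones - band_part(ones, -1, 0)) * 1e12 from the
    attention logits in the snippet above."""
    ones = np.ones((seq_len, seq_len))
    future = ones - np.tril(ones)  # 1 strictly above the diagonal
    return -future * big

bias = history_only_bias(4)
```

The isinstance(a_mask, str) check works because only string comparison against a tensor is ill-defined; the mask semantics are unchanged.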
I'm posting this to check whether I'm doing the right thing.
I just add a dense and softmax layer to fine-tune the model:
albert_model = build_bert_model(config_path, checkpoint_path, albert=True)
out = Lambda(lambda x: x[:, 0])(albert_model.output)
output = Dense(units=class_num, activation='softmax')(out)
After training, I try to load the model with
model = load_model(model.dir)
and get an error like "missing custom layer 'TokenEmbedding'".
After that I try
custom_objects = {'MaskedGlobalPool1D': MaskedGlobalPool1D}
custom_objects.update(get_bert_custom_objects())
get_bert_custom_objects() comes from keras_bert and basically just defines some custom layers, while MaskedGlobalPool1D, also from keras_bert, strips the mask from the model's output.
I don't know if I'm doing this right, since the predictions aren't good enough.
Can someone explain what the TokenEmbedding layer is, versus the dense layer I defined?
Sorry to bother you — my CUDA version is 9.x, but tensorflow 1.13+ requires CUDA 10. Is the older package that supported lower tensorflow versions still available?
I read in your blog that tf.keras can train with multiple GPUs and TPUs.
So I modified the Semantic example as follows; my environment is Colab's built-in TPU with
tensorflow 1.15
bert4keras 0.3.4
import os
os.environ['TF_KERAS'] = '1'
import json
import numpy as np
import codecs
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.backend as K
from bert4keras.backend import set_gelu
from bert4keras.tokenizer import Tokenizer
from bert4keras.bert import build_bert_model
from bert4keras.optimizers import Adam, extend_with_piecewise_linear_lr
from bert4keras.snippets import sequence_padding, get_all_attributes
locals().update(get_all_attributes(keras.layers))
set_gelu('tanh')
### unchanged from the example ###
# TF1 TPU
resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)
# Adam Learning Rate
AdamLR = extend_with_piecewise_linear_lr(Adam)
with strategy.scope():
    # load the pretrained model
    bert = build_bert_model(
        config_path=config_path,
        checkpoint_path=checkpoint_path,
        with_pool=True,
        albert=True,
        return_keras_model=False,
    )
    output = Dropout(rate=0.1)(bert.model.output)
    output = Dense(units=2,
                   activation='softmax',
                   kernel_initializer=bert.initializer)(output)
    model = keras.models.Model(bert.model.input, output)
    model.compile(
        loss='sparse_categorical_crossentropy',
        # optimizer=Adam(1e-5),  # use a sufficiently small learning rate
        optimizer=AdamLR(learning_rate=1e-4, lr_schedule={1000: 1, 2000: 0.1}),
        metrics=['accuracy'])
    model.summary()
### unchanged from the example ###
model.fit_generator(train_generator.forfit(),
                    steps_per_epoch=len(train_generator),
                    epochs=10,
                    callbacks=[evaluator])
At the fit_generator step it fails with `fit_generator is not supported for models compiled with tf.distribute.Strategy.`
From what I've read, fit can also take a generator, but the Keras docs don't explain how to set the arguments in that case. Or is tf.data required?
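Under a tf.distribute strategy, model.fit with a tf.data.Dataset is the supported path, and a Python generator can be wrapped into one. A sketch (the shapes and dtypes here are placeholders for whatever train_generator actually yields):

```python
import numpy as np
import tensorflow as tf

def gen():
    # Stand-in for train_generator.forfit(): yields (inputs, labels).
    for _ in range(4):
        x = (np.zeros((2, 16), np.int32), np.zeros((2, 16), np.int32))
        y = np.zeros((2, 1), np.int32)
        yield x, y

dataset = tf.data.Dataset.from_generator(
    gen,
    output_types=((tf.int32, tf.int32), tf.int32),
    output_shapes=((tf.TensorShape([None, None]), tf.TensorShape([None, None])),
                   tf.TensorShape([None, 1])))

# model.fit(dataset.repeat(), steps_per_epoch=len(train_generator),
#           epochs=10, callbacks=[evaluator])  # built inside strategy.scope()
```

The nested output_types/output_shapes structure must mirror the generator's (inputs, labels) structure exactly; TPU training may additionally require fixed (non-None) sequence lengths.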
Will there be example code for reading comprehension on SQuAD later?
I tried the method below and got an error. Not sure whether I'm doing it wrong — could you take a look?
bert = build_bert_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    with_pool=False,
    return_keras_model=True)
x1 = K.function([bert.layers[0].input],
                [bert.layers[-2].output])
x2 = K.function([bert.layers[0].input],
                [bert.layers[-3].output])
After loading the pretrained model, fine-tuning, and saving it as a new keras .h5 model — how do I load this new model? What should the custom_objects argument be?
I made a simple change to the read_texts part of task_seq2seq.py so that it reads my data; no other source changes.
Environment: python 3.6.4, Keras 2.3.1, Tensorflow 2.0, bert4keras 0.2.6, OS Ubuntu.
I'm seeing two problems:
1. With the tf 2.0 backend (environment variable set as documented), the model builds successfully but fails at the add_loss step:
2019-11-14 17:57:47.522304: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Invalid argument: You must feed a value for placeholder tensor 'Input-Segment' with dtype float and shape [?,?]
[[{{node Input-Segment}}]]
Traceback (most recent call last):
File "task_seq2seq.py", line 159, in <module>
model.add_loss(cross_entropy)
I can't paste from the intranet, so the above was typed by hand; I'll stop there.
2. With the keras backend, the model build doesn't get through (no summary is printed). The error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Tensor'
......
During handling of the above exception, another exception occurred
.....
ValueError: Duplicate node name in graph: 'Attention-Mask/ones/packed'
Hoping for some help — let me know if you need more information.
ValueError: You called set_weights(weights) on layer "Encoder-1-MultiHeadSelfAttention" with a weight list of length 8, but the layer was expecting 16 weights. Provided weights: [array([[-0.01839516, -0.02298158, 0.07288713, .....
The error is as above. Any help would be appreciated.
In bert4keras/bert.py, in the if self.with_mlm branch,
x = EmbeddingDense(embedding_name='Embedding-Token', activation=self.with_mlm, name='MLM-Proba')(x)
errors out because of activation=self.with_mlm. The corresponding line in EmbeddingDense is
self.activation = activations.get(activation)
so activation should be the name of an activation function, but a boolean seems to be passed here.
Hello, and thanks for the tool you've provided!
I've hit a problem and I'm not sure whether it's a bug — could you take a look?
When testing the imports, running nothing more than:
import os
os.environ['TF_KERAS'] = '1'
from bert4keras.bert import load_pretrained_model, set_gelu
from bert4keras.utils import SimpleTokenizer, load_vocab
from bert4keras.train import PiecewiseLinearLearningRate
set_gelu('tanh')
I get the error:
File "/home/flydsc/anaconda3/envs/main_work/lib/python3.7/site-packages/bert4keras/bert.py", line 4, in <module>
from .layers import *
File "/home/flydsc/anaconda3/envs/main_work/lib/python3.7/site-packages/bert4keras/layers.py", line 60, in <module>
class OurLayer(Layer):
NameError: name 'Layer' is not defined
But when I remove the first two lines that set the environment variable, i.e.:
from bert4keras.bert import load_pretrained_model, set_gelu
from bert4keras.utils import SimpleTokenizer, load_vocab
from bert4keras.train import PiecewiseLinearLearningRate
set_gelu('tanh')
the import succeeds.
My guess is that tf.keras has a slight incompatibility in globals().update(keras.layers.__dict__)?
My environment:
python 3.7
TensorFlow 1.14.0
Thanks again for your generosity.
After load_model, the same test set gives different results on every run. 苏神, could you add a load_model/load_weights example to the examples?
models.load_model cannot load a model saved by models.save_model — it raises ValueError: Unknown layer: FactorizedEmbedding. I see the examples use save_weights, but I'd like the model structure to be saved as well.
Line 332 in a6f8a0d
Using example/task_sentiment_albert.py to load 徐亮's tiny ALBERT gives the following error:
ValueError: Layer weight shape (21128, 312) not compatible with provided weight shape (21128, 128)
Loading with keras_bert gives the same error — what's going on?
Line 245 in f5741a4
When loading Google's weights I get "attempt to get argmax of an empty sequence", so I added a check. The root cause is probably the pop operation: there are multiple hidden layers, but because of ALBERT's parameter sharing, variable_names has far fewer entries than expected, which triggers the error.
For practical reasons, production systems often run older versions. The previous bert4keras worked nicely with keras 2.2; after upgrading to 2.3, the set_weights issue other users mentioned appears, so 2.2 had to be dropped.
from bert4keras.bert import load_pretrained_model as load_trained_model_from_checkpoint
from bert4keras.utils import SimpleTokenizer as Tokenizer
from keras.layers import Input, TimeDistributed
from keras.models import Model
config_path = './chinese_L-12_H-768_A-12/bert_config.json'
checkpoint_path = './chinese_L-12_H-768_A-12/bert_model.ckpt'
dict_path = './chinese_L-12_H-768_A-12/vocab.txt'
bert_model = load_trained_model_from_checkpoint(config_path,
                                                checkpoint_path)
for l in bert_model.layers:
    l.trainable = True
MAX_SENTENCE_LENGTH = 128
MAX_SENTENCE_COUNT = 64
x1_in = Input(shape=(MAX_SENTENCE_LENGTH, ), dtype='int32')
x2_in = Input(shape=(MAX_SENTENCE_LENGTH, ), dtype='int32')
x1, x2 = x1_in, x2_in
sentence = bert_model([x1, x2])
# sentence = Lambda(lambda x: x[:, 0])(sentence)
model1 = Model([x1_in, x2_in], sentence)
model1.summary()
texts_in = Input(shape=(MAX_SENTENCE_COUNT, MAX_SENTENCE_LENGTH, 2),
                 dtype='int32')
attention_weighted_sentences = TimeDistributed(model1)(texts_in)
model = Model(texts_in, attention_weighted_sentences)
model.summary()
Traceback (most recent call last):
File "/home/phoenixkiller/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/ptvsd_launcher.py", line 43, in <module>
main(ptvsdArgs)
File "/home/phoenixkiller/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/__main__.py", line 432, in main
run()
File "/home/phoenixkiller/.vscode/extensions/ms-python.python-2019.9.34911/pythonFiles/lib/python/ptvsd/__main__.py", line 316, in run_file
runpy.run_path(target, run_name='__main__')
File "/home/phoenixkiller/anaconda3/envs/keras_debug/lib/python3.6/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/home/phoenixkiller/anaconda3/envs/keras_debug/lib/python3.6/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/home/phoenixkiller/anaconda3/envs/keras_debug/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/phoenixkiller/source/bert4keras/test copy.py", line 36, in <module>
attention_weighted_sentences = TimeDistributed(model1)(texts_in)
File "/home/phoenixkiller/source/keras/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/home/phoenixkiller/source/keras/keras/layers/wrappers.py", line 248, in call
y = self.layer.call(inputs, **kwargs)
File "/home/phoenixkiller/source/keras/keras/engine/network.py", line 564, in call
output_tensors, _, _ = self.run_internal_graph(inputs, masks)
File "/home/phoenixkiller/source/keras/keras/engine/network.py", line 798, in run_internal_graph
assert str(id(x)) in tensor_map, 'Could not compute output ' + str(x)
AssertionError: Could not compute output Tensor("model_1/Encoder-12-FeedForward-Norm/add_1:0", shape=(?, 128, 768), dtype=float32)
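TimeDistributed cannot see inside a multi-input inner model, which is why run_internal_graph cannot compute the output. A common workaround is to fold the sentence axis into the batch axis, run the two-input model once, and unfold. A sketch with numpy only (the bert_model.predict line is the hypothetical piece):

```python
import numpy as np

batch, n_sent, sent_len = 2, 64, 128
# texts[..., 0] holds token ids, texts[..., 1] holds segment ids.
texts = np.zeros((batch, n_sent, sent_len, 2), dtype='int32')

tokens = texts[..., 0].reshape(-1, sent_len)    # (batch*n_sent, sent_len)
segments = texts[..., 1].reshape(-1, sent_len)
# sent_vecs = bert_model.predict([tokens, segments])  # hypothetical call
# sent_vecs = sent_vecs.reshape(batch, n_sent, -1)    # back to per-document
```

Inside a Keras graph, the same fold/unfold can be done with Reshape (or Lambda) layers around the shared BERT sub-model.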
On Windows I got the following error:
Connected to pydev debugger (build 193.5662.61)
2019-12-24 11:52:07.952336: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
构建词汇表中: 0it [00:00, ?it/s]Traceback (most recent call last):
File "D:\Anaconda3\envs\tf2\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "D:\Anaconda3\envs\tf2\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'parallel_apply.<locals>.worker_step'
2019-12-24 11:52:13.359467: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Traceback (most recent call last):
File "D:\Anaconda3\envs\tf2\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
But on Ubuntu there's no error — why is that?
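Windows uses the spawn start method, which must pickle the worker function to ship it to child processes; a function defined inside another function (like parallel_apply's worker_step) cannot be pickled. Linux defaults to fork, which copies the process and sidesteps pickling entirely. A minimal demonstration:

```python
import pickle

def module_level_worker(x):
    # Picklable by reference: spawn can re-import it in the child process.
    return x * x

def make_local_worker():
    def worker(x):  # defined inside a function: not picklable
        return x * x
    return worker

pickle.dumps(module_level_worker)  # fine

try:
    pickle.dumps(make_local_worker())
    local_picklable = True
except (pickle.PicklingError, AttributeError):
    local_picklable = False
```

So a fix on Windows is to hoist the worker to module level (or pass picklable arguments instead of closures) before handing it to multiprocessing.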
0it [00:00, ?it/s]Traceback (most recent call last):
File "D:/pycharm/bert4keras/examples/task_relation_extraction.py", line 323, in <module>
callbacks=[evaluator, EMAer])
File "E:\Anaconda\envs\keras\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "E:\Anaconda\envs\keras\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "E:\Anaconda\envs\keras\lib\site-packages\keras\engine\training_generator.py", line 251, in fit_generator
callbacks.on_epoch_end(epoch, epoch_logs)
File "E:\Anaconda\envs\keras\lib\site-packages\keras\callbacks.py", line 79, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "D:/pycharm/bert4keras/examples/task_relation_extraction.py", line 305, in on_epoch_end
f1, precision, recall = evaluate(valid_data)
File "D:/pycharm/bert4keras/examples/task_relation_extraction.py", line 272, in evaluate
R = set([SPO(spo) for spo in extract_spoes(d['text'])])
File "D:/pycharm/bert4keras/examples/task_relation_extraction.py", line 272, in <listcomp>
R = set([SPO(spo) for spo in extract_spoes(d['text'])])
File "D:/pycharm/bert4keras/examples/task_relation_extraction.py", line 251, in __init__
super(SPO, self).__init__(spo)
TypeError: object.__init__() takes no parameters
0it [00:03, ?it/s]
So far the examples only cover ckpt files. Is there a way (or an example) to use https://tfhub.dev/google/albert_base/2 directly?