Comments (18)
But when I print model.embeddings.token_type_embeddings, it is Embedding(16, 768).
Which model are you loading?
> Which model are you loading?

The pre-trained model chinese_L-12_H-768_A-12.
My code:

```python
import torch
from transformers import BertConfig, BertModel  # or the equivalent import in your install

bert_config = BertConfig.from_json_file('bert_config.json')
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))
```
The error:

```
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
```
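One quick way to see which side is off is to inspect the shapes stored in pytorch_model.bin directly. A minimal sketch with plain torch (the key name is taken from the error message above):

```python
import torch

# Load the converted checkpoint on CPU and look at the stored token type
# embedding weight; the key name comes from the size-mismatch message.
state_dict = torch.load('pytorch_model.bin', map_location='cpu')
print(state_dict['embeddings.token_type_embeddings.weight'].shape)
# Prints torch.Size([16, 768]) here, while a model built from the chinese
# config ("type_vocab_size": 2) expects torch.Size([2, 768]).
```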
I'm testing the chinese model. Do you use the config.json of the chinese_L-12_H-768_A-12? Can you send the content of your config.json?
> Do you use the config.json of the chinese_L-12_H-768_A-12? Can you send the content of your config.json?
In the config.json of the chinese_L-12_H-768_A-12, type_vocab_size is 2. But even when I change config.type_vocab_size to 16, it still errors.
> Can you send the content of your config.json?
```json
{
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128
}
```
I changed my code:

```python
bert_config = BertConfig.from_json_file('bert_config.json')
bert_config.type_vocab_size = 16
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))
```
It still errors.
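One way to confirm that the override actually reached the model is to print the rebuilt embedding (a minimal check reusing the classes from the snippets above):

```python
bert_config = BertConfig.from_json_file('bert_config.json')
bert_config.type_vocab_size = 16
model = BertModel(bert_config)

# If this prints Embedding(16, 768), the model side now matches a checkpoint
# saved with type_vocab_size=16, and any remaining mismatch would have to come
# from the checkpoint itself (e.g. a stale or partially converted file).
print(model.embeddings.token_type_embeddings)
```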
I see you have "type_vocab_size": 2 in your config file, how is that?
Yes, but I changed it in my code.
Is your pytorch_model.bin the correctly converted Chinese model (and not an English one)?
I think it's good.
Ok, I have the models. I think type_vocab_size should be 2 for Chinese as well. I am wondering why it is 16 in your pytorch_model.bin.
I have no idea. Did the conversion of my model go wrong?
I am testing that right now. I haven't played with the multi-lingual models yet.
> I am testing that right now. I haven't played with the multi-lingual models yet.

I am also using it for the first time. I am looking forward to your test results.
> I am testing that right now. I haven't played with the multi-lingual models yet.

This is what I got when converting the model:
```
Traceback (most recent call last):
  File "convert_tf_checkpoint_to_pytorch.py", line 95, in <module>
    convert()
  File "convert_tf_checkpoint_to_pytorch.py", line 85, in convert
    assert pointer.shape == array.shape
AssertionError: (torch.Size([16, 768]), (2, 768))
```
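That assertion compares a parameter of the freshly built PyTorch model against the array read from the TensorFlow checkpoint. A reconstructed sketch of the loop around line 85 (names follow the traceback; the loop body is paraphrased, not copied from the script):

```python
# Reconstructed sketch, not the actual script source. For each variable read
# from the TF checkpoint, the converter locates the matching PyTorch parameter
# and copies the weights across.
for name, array in tf_variables:             # numpy arrays from the TF checkpoint
    pointer = find_parameter(model, name)    # hypothetical lookup in the new model
    # Here (16, 768) is the PyTorch side and (2, 768) the TF side: the PyTorch
    # model was built with the default type_vocab_size=16, so the json config
    # with "type_vocab_size": 2 never took effect.
    assert pointer.shape == array.shape
    pointer.data = torch.from_numpy(array)
```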
Are you supplying a config file with "type_vocab_size": 2 to the conversion script?
> Are you supplying a config file with "type_vocab_size": 2 to the conversion script?
I used the bert_config.json of the chinese_L-12_H-768_A-12 when converting.
Ok, I think I found the issue: your BertConfig is not built from the configuration file for some reason and thus uses the default value of type_vocab_size in BertConfig, which is 16. This error happens on my system when I use config = BertConfig('bert_config.json') instead of config = BertConfig.from_json_file('bert_config.json'). I will make sure these two ways of initializing the configuration (from parameters or from a json file) cannot be mixed up.
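For reference, a toy sketch of the failure mode described above (ToyBertConfig is a hypothetical stand-in, not the library class): when the constructor's first positional parameter is a plain value, passing a file name there silently leaves every other field at its default.

```python
import json

class ToyBertConfig:
    """Hypothetical stand-in mimicking the pre-fix constructor behavior."""

    def __init__(self, vocab_size=None, type_vocab_size=16):
        # The file name lands in vocab_size; nothing is read from disk.
        self.vocab_size = vocab_size
        self.type_vocab_size = type_vocab_size

    @classmethod
    def from_json_file(cls, path):
        # This path actually parses the file, so "type_vocab_size": 2 is honored.
        with open(path) as f:
            data = json.load(f)
        config = cls()
        config.__dict__.update(data)
        return config

wrong = ToyBertConfig('bert_config.json')                 # defaults kept: 16
right = ToyBertConfig.from_json_file('bert_config.json')  # parsed from file: 2
print(wrong.type_vocab_size, right.type_vocab_size)       # 16 2
```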
```
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
```

I have the same problem as you. Did you solve it?