
Comments (1)

ChanningPing commented on September 6, 2024

I'm having a similar issue:
```
06/29/2020 20:52:32 - INFO - __main__ - device: cuda n_gpu: 4, distributed training: False, 16-bits training: False
06/29/2020 20:52:32 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/ubuntu/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
06/29/2020 20:52:33 - INFO - vilbert.task_utils - Loading RetrievalFlickr30k Dataset with batch size 1
06/29/2020 20:52:37 - INFO - vilbert.vilbert - loading archive file save/bert_base_6_layer_6_connect/pytorch_model_9.bin
06/29/2020 20:52:37 - INFO - vilbert.vilbert - Model config {
"attention_probs_dropout_prob": 0.1,
"bi_attention_type": 1,
"bi_hidden_size": 1024,
"bi_intermediate_size": 1024,
"bi_num_attention_heads": 8,
"fast_mode": true,
"fixed_t_layer": 0,
"fixed_v_layer": 0,
"fusion_method": "mul",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"in_batch_pairs": false,
"initializer_range": 0.02,
"intermediate_size": 3072,
"intra_gate": false,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pooling_method": "mul",
"predict_feature": false,
"t_biattention_id": [
6,
7,
8,
9,
10,
11
],
"type_vocab_size": 2,
"v_attention_probs_dropout_prob": 0.1,
"v_biattention_id": [
0,
1,
2,
3,
4,
5
],
"v_feature_size": 2048,
"v_hidden_act": "gelu",
"v_hidden_dropout_prob": 0.1,
"v_hidden_size": 1024,
"v_initializer_range": 0.02,
"v_intermediate_size": 1024,
"v_num_attention_heads": 8,
"v_num_hidden_layers": 6,
"v_target_size": 1601,
"vocab_size": 30522,
"with_coattention": true
}

model's option for predict_feature is False
06/29/2020 20:52:44 - INFO - vilbert.vilbert - Weights from pretrained model not used in BertForMultiModalPreTraining: ['bert.encoder.v_layer.6.attention.self.query.weight', 'bert.encoder.v_layer.6.attention.self.query.bias', 'bert.encoder.v_layer.6.attention.self.key.weight', 'bert.encoder.v_layer.6.attention.self.key.bias', 'bert.encoder.v_layer.6.attention.self.value.weight', 'bert.encoder.v_layer.6.attention.self.value.bias', 'bert.encoder.v_layer.6.attention.output.dense.weight', 'bert.encoder.v_layer.6.attention.output.dense.bias', 'bert.encoder.v_layer.6.attention.output.LayerNorm.weight', 'bert.encoder.v_layer.6.attention.output.LayerNorm.bias', 'bert.encoder.v_layer.6.intermediate.dense.weight', 'bert.encoder.v_layer.6.intermediate.dense.bias', 'bert.encoder.v_layer.6.output.dense.weight', 'bert.encoder.v_layer.6.output.dense.bias', 'bert.encoder.v_layer.6.output.LayerNorm.weight', 'bert.encoder.v_layer.6.output.LayerNorm.bias', 'bert.encoder.v_layer.7.attention.self.query.weight', 'bert.encoder.v_layer.7.attention.self.query.bias', 'bert.encoder.v_layer.7.attention.self.key.weight', 'bert.encoder.v_layer.7.attention.self.key.bias', 'bert.encoder.v_layer.7.attention.self.value.weight', 'bert.encoder.v_layer.7.attention.self.value.bias', 'bert.encoder.v_layer.7.attention.output.dense.weight', 'bert.encoder.v_layer.7.attention.output.dense.bias', 'bert.encoder.v_layer.7.attention.output.LayerNorm.weight', 'bert.encoder.v_layer.7.attention.output.LayerNorm.bias', 'bert.encoder.v_layer.7.intermediate.dense.weight', 'bert.encoder.v_layer.7.intermediate.dense.bias', 'bert.encoder.v_layer.7.output.dense.weight', 'bert.encoder.v_layer.7.output.dense.bias', 'bert.encoder.v_layer.7.output.LayerNorm.weight', 'bert.encoder.v_layer.7.output.LayerNorm.bias', 'bert.encoder.c_layer.6.biattention.query1.weight', 'bert.encoder.c_layer.6.biattention.query1.bias', 'bert.encoder.c_layer.6.biattention.key1.weight', 'bert.encoder.c_layer.6.biattention.key1.bias', 'bert.encoder.c_layer.6.biattention.value1.weight', 'bert.encoder.c_layer.6.biattention.value1.bias', 'bert.encoder.c_layer.6.biattention.query2.weight', 'bert.encoder.c_layer.6.biattention.query2.bias', 'bert.encoder.c_layer.6.biattention.key2.weight', 'bert.encoder.c_layer.6.biattention.key2.bias', 'bert.encoder.c_layer.6.biattention.value2.weight', 'bert.encoder.c_layer.6.biattention.value2.bias', 'bert.encoder.c_layer.6.biOutput.dense1.weight', 'bert.encoder.c_layer.6.biOutput.dense1.bias', 'bert.encoder.c_layer.6.biOutput.LayerNorm1.weight', 'bert.encoder.c_layer.6.biOutput.LayerNorm1.bias', 'bert.encoder.c_layer.6.biOutput.q_dense1.weight', 'bert.encoder.c_layer.6.biOutput.q_dense1.bias', 'bert.encoder.c_layer.6.biOutput.dense2.weight', 'bert.encoder.c_layer.6.biOutput.dense2.bias', 'bert.encoder.c_layer.6.biOutput.LayerNorm2.weight', 'bert.encoder.c_layer.6.biOutput.LayerNorm2.bias', 'bert.encoder.c_layer.6.biOutput.q_dense2.weight', 'bert.encoder.c_layer.6.biOutput.q_dense2.bias', 'bert.encoder.c_layer.6.v_intermediate.dense.weight', 'bert.encoder.c_layer.6.v_intermediate.dense.bias', 'bert.encoder.c_layer.6.v_output.dense.weight', 'bert.encoder.c_layer.6.v_output.dense.bias', 'bert.encoder.c_layer.6.v_output.LayerNorm.weight', 'bert.encoder.c_layer.6.v_output.LayerNorm.bias', 'bert.encoder.c_layer.6.t_intermediate.dense.weight', 'bert.encoder.c_layer.6.t_intermediate.dense.bias', 'bert.encoder.c_layer.6.t_output.dense.weight', 'bert.encoder.c_layer.6.t_output.dense.bias', 'bert.encoder.c_layer.6.t_output.LayerNorm.weight', 
'bert.encoder.c_layer.6.t_output.LayerNorm.bias', 'bert.encoder.c_layer.7.biattention.query1.weight', 'bert.encoder.c_layer.7.biattention.query1.bias', 'bert.encoder.c_layer.7.biattention.key1.weight', 'bert.encoder.c_layer.7.biattention.key1.bias', 'bert.encoder.c_layer.7.biattention.value1.weight', 'bert.encoder.c_layer.7.biattention.value1.bias', 'bert.encoder.c_layer.7.biattention.query2.weight', 'bert.encoder.c_layer.7.biattention.query2.bias', 'bert.encoder.c_layer.7.biattention.key2.weight', 'bert.encoder.c_layer.7.biattention.key2.bias', 'bert.encoder.c_layer.7.biattention.value2.weight', 'bert.encoder.c_layer.7.biattention.value2.bias', 'bert.encoder.c_layer.7.biOutput.dense1.weight', 'bert.encoder.c_layer.7.biOutput.dense1.bias', 'bert.encoder.c_layer.7.biOutput.LayerNorm1.weight', 'bert.encoder.c_layer.7.biOutput.LayerNorm1.bias', 'bert.encoder.c_layer.7.biOutput.q_dense1.weight', 'bert.encoder.c_layer.7.biOutput.q_dense1.bias', 'bert.encoder.c_layer.7.biOutput.dense2.weight', 'bert.encoder.c_layer.7.biOutput.dense2.bias', 'bert.encoder.c_layer.7.biOutput.LayerNorm2.weight', 'bert.encoder.c_layer.7.biOutput.LayerNorm2.bias', 'bert.encoder.c_layer.7.biOutput.q_dense2.weight', 'bert.encoder.c_layer.7.biOutput.q_dense2.bias', 'bert.encoder.c_layer.7.v_intermediate.dense.weight', 'bert.encoder.c_layer.7.v_intermediate.dense.bias', 'bert.encoder.c_layer.7.v_output.dense.weight', 'bert.encoder.c_layer.7.v_output.dense.bias', 'bert.encoder.c_layer.7.v_output.LayerNorm.weight', 'bert.encoder.c_layer.7.v_output.LayerNorm.bias', 'bert.encoder.c_layer.7.t_intermediate.dense.weight', 'bert.encoder.c_layer.7.t_intermediate.dense.bias', 'bert.encoder.c_layer.7.t_output.dense.weight', 'bert.encoder.c_layer.7.t_output.dense.bias', 'bert.encoder.c_layer.7.t_output.LayerNorm.weight', 'bert.encoder.c_layer.7.t_output.LayerNorm.bias']
Num Iters: {'TASK3': 10000}
Batch size: {'TASK3': 1}
Traceback (most recent call last):
  File "eval_retrieval.py", line 275, in <module>
    main()
  File "eval_retrieval.py", line 230, in main
    score_matrix[caption_idx, image_idx*500:(image_idx+1)*500] = torch.softmax(vil_logit, dim=1)[:,0].view(-1).cpu().numpy()
ValueError: could not broadcast input array from shape (125) into shape (500)
```
I followed the instructions for zero-shot image retrieval exactly. Any ideas on what is missing?
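
For reference, the shapes in the traceback suggest that the loop at `eval_retrieval.py` line 230 writes a fixed 500-wide slice of `score_matrix` per sub-batch, while the model only returned 125 scores for that chunk (which can happen when the gallery feature file contains fewer images than the script assumes). The sketch below is a minimal, self-contained reproduction of that mismatch together with one possible workaround; all sizes and names (`num_captions`, `gallery_size`, `chunk`, `scores`) are illustrative assumptions, not code from the repository, and this is not an official fix.

```python
# Minimal, self-contained reproduction of the broadcast mismatch plus a possible
# workaround. Every size here is illustrative (an assumption): a destination matrix
# with 500-wide chunks, and a sub-batch that came back with only 125 scores.
import numpy as np

num_captions, gallery_size, chunk = 4, 1000, 500    # assumed sizes for illustration
score_matrix = np.zeros((num_captions, gallery_size))
caption_idx, image_idx = 0, 1
scores = np.random.rand(125)                        # model returned 125 scores, not 500

# Pattern from the traceback: assumes every chunk holds exactly `chunk` scores.
#   score_matrix[caption_idx, image_idx*chunk:(image_idx+1)*chunk] = scores
#   -> ValueError: could not broadcast input array from shape (125) into shape (500)

# Workaround sketch: write only as many entries as were actually returned.
start = image_idx * chunk
score_matrix[caption_idx, start:start + len(scores)] = scores
```

If a slice like this silences the error but leaves part of `score_matrix` at zero, the underlying problem is likely upstream: the extracted image features or the evaluation batch settings do not cover the full gallery the script expects, so checking the feature file against the retrieval split is probably the first thing to verify.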

(from the vilbert_beta repository)
