guotong1988 / nl2sql-rule Goto Github PK

View Code? Open in Web Editor NEW

183.0 7.0 48.0 6.48 MB

Content Enhanced BERT-based Text-to-SQL Generation https://arxiv.org/abs/1910.07179

Python 60.86% Jupyter Notebook 39.14%

nl2sql pytorch semantic-parsing nlp bert knowledge-representation knowledge deep-learning text2sql rule-inject-to-model

nl2sql-rule's Introduction

Hi! My name is Tong Guo.

Reviewer of ACL-2023, NAACL-2022/2024, EMNLP-2022/2023 Industry Track.

Another email: [email protected]

Never Give Up. Learn From Failure.

Print is the best debugging method.

nl2sql-rule's People

Contributors

Stargazers

Watchers

nl2sql-rule's Issues

missing training data

Hey,

Could verify the dataset

'./data_and_model/train_tok.jsonl'

'./data_and_model/train_knowledge.jsonl'

These are same ? Because your code reading train_tok but it's missing

problem with knowledge when running inference

Thanks for the pretrained model! I was trying to run inference using python3 ./train.py --trained --do_infer, but get this error:

Traceback (most recent call last):
  File "./train.py", line 799, in <module>
    beam_size=1, show_table=False, show_answer_only=False
  File "./train.py", line 632, in infer
    beam_size=beam_size)
  File "/base/sqlova/model/nl2sql/wikisql_models.py", line 115, in beam_forward
    knowledge=knowledge, knowledge_header=knowledge_header)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/base/sqlova/model/nl2sql/wikisql_models.py", line 562, in forward
    knowledge = [k + (mL_n - len(k)) * [0] for k in knowledge]
TypeError: 'NoneType' object is not iterable

I've placed something in data_and_model/ctable.tables.jsonl and data_and_model/ctable.db. I've placed a ctable_knowledge.jsonl file in a few places, but I don't see how it would be read. What am I doing wrong?

Killed - python train.py - CPU

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 8
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Killed

Problem while running train.py

Hello, I tried to clone your repo and run it on my local system. However when I try a normal python train.py or the command mentioned in sqlova repository, I get the following error

python train.py

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: False
Traceback (most recent call last):
  File "train.py", line 703, in <module>
    model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH)
  File "train.py", line 164, in get_models
    args.no_pretraining)
  File "train.py", line 124, in get_bert
    bert_config.print_status()
AttributeError: 'BertConfig' object has no attribute 'print_status'

When I comment out bert_config.print_status(), I get another error as following

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: False
Traceback (most recent call last):
  File "train.py", line 703, in <module>
    model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH)
  File "train.py", line 164, in get_models
    args.no_pretraining)
  File "train.py", line 126, in get_bert
    model_bert = BertModel(bert_config)
TypeError: __init__() missing 2 required positional arguments: 'is_training' and 'input_ids'

Any solution to this?

train.py

NL2SQL-RULE>python train.py
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 8
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 726, in
dset_name="train")
File "train.py", line 221, in train
for iB, t in enumerate(train_loader):
File "C:\Users...\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 278, in iter
return _MultiProcessingDataLoaderIter(self)
File "C:\Users...\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 682, in init
w.start()
File "C:\Users...\Anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:\Users...\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users...\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users...\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in init
reduction.dump(process_obj, to_child)
File "C:\Users...\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_loader_wikisql..'

...NL2SQL-RULE>Traceback (most recent call last):
File "", line 1, in
File "C:\Users...\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users...\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

两个额外特征

您在论文里面提到引入两个额外的特征：问句和表格中的cell的match向量，问句和列的match向量，为了更好的理解，我尝试去您的代码中找这块的代码，但不太能找得到，能麻烦大神能跟我说下这块代码在哪个脚本里嘛？

Updates on Large Models.

HI is there any updates on fine-tuned model available for
uncased_L-24_H-1024_A-16 bert model.

Test Data

If a have a SQL table how can I convert it to test_data and test_table to find answers to a question related to the test database table. Can you please help me with this issue.

How much time to train this model

Hi, you've tried to iterate 200 times in the whole training process. But I ran your code and found one epoch would takes me several hours.

How do I use this for my dataset?

Does this model support converting Chinese language to SQL?

No module named '_sqlite3'

TODO, add the data type info of column into model

Why no use of BertTokenizer?

Hello,
why there is no use of BertTokenizer and what is the advantage of the custom BasicTokenizer?

Best Regards!

AttributeError: ‘RMKeyView‘ object has no attribute ‘index‘

Key differences between NL2SQL-BERT and SQLova?

What is the main difference between this and SQLova models?
Can this model perform tasks on Multiple Tables or join operations?
How do I infer after converting my data to similar to train data?
SOTA for this Text to SQL task.

Testing models with simple prediction - Generate predictions JSON files

I have 2 questions very close :

1)How can i test the model using simple user question and schema table ? simply, how to do simple prediction ?

2)How to generate predictions JSON files (for WikiSQL TrainigSet and DevSet ?)

Thank you very much for helping me.

evaluate how to use

您好，请问您是如何使用do_infer的， ctable_ftable1这些数据如何获得？
谢谢！

from sqlnet.dbengine import DBEngine ModuleNotFoundError: No module named 'sqlnet.dbengine'

found this problem with sqlnet==0.1

Evaluate cmd statement

What's the correct statement to run the evaluate.py?

CRF support

ERROR when running train.py (RuntimeError: Error(s) in loading state_dict for Seq2SQL_v1)

Hi,
I try to run python3 train.py --trained --bert_type_abb uS but it gives me this error : RuntimeError: Error(s) in loading state_dict for Seq2SQL_v1:

Details of execution is below :

XXXX@YYYY:/mnt/c/users/administrateur/desktop/sqlova$ python3 train.py --trained --bert_type_abb uS

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: False
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 741, in
path_model_bert=path_model_bert, path_model=path_model)
File "train.py", line 196, in get_models
model.load_state_dict(res['model'])
File "/home/ysfmell/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Seq2SQL_v1:
size mismatch for scp.W_att.weight: copying a param with shape torch.Size([103, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for scp.W_att.bias: copying a param with shape torch.Size([103]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for scp.W_c.weight: copying a param with shape torch.Size([100, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for scp.W_hs.weight: copying a param with shape torch.Size([100, 103]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for sap.W_att.weight: copying a param with shape torch.Size([103, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for sap.W_att.bias: copying a param with shape torch.Size([103]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for sap.sa_out.0.weight: copying a param with shape torch.Size([100, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wnp.W_att_h.weight: copying a param with shape torch.Size([1, 103]) from checkpoint, the shape in current model is torch.Size([1, 100]).
size mismatch for wnp.W_hidden.weight: copying a param with shape torch.Size([200, 103]) from checkpoint, the shape in current model is torch.Size([200, 100]).
size mismatch for wnp.W_cell.weight: copying a param with shape torch.Size([200, 103]) from checkpoint, the shape in current model is torch.Size([200, 100]).
size mismatch for wnp.W_att_n.weight: copying a param with shape torch.Size([1, 105]) from checkpoint, the shape in current model is torch.Size([1, 100]).
size mismatch for wnp.wn_out.0.weight: copying a param with shape torch.Size([100, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wcp.W_att.weight: copying a param with shape torch.Size([103, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wcp.W_att.bias: copying a param with shape torch.Size([103]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for wcp.W_c.weight: copying a param with shape torch.Size([100, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wcp.W_hs.weight: copying a param with shape torch.Size([100, 103]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wvp.W_att.weight: copying a param with shape torch.Size([103, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wvp.W_att.bias: copying a param with shape torch.Size([103]) from checkpoint, the shape in current model is torch.Size([100]).
size mismatch for wvp.W_c.weight: copying a param with shape torch.Size([100, 105]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wvp.W_hs.weight: copying a param with shape torch.Size([100, 103]) from checkpoint, the shape in current model is torch.Size([100, 100]).
size mismatch for wvp.wv_out.0.weight: copying a param with shape torch.Size([100, 405]) from checkpoint, the shape in current model is torch.Size([100, 400]).

ERROR when running train (test method only)

Hi,

Can you help please ??

when i try to run train.py precisaly the test method, it gives me the error below:

BERT-type: uncased_L-24_H-1024_A-16
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: False
vocab size: 30522
hidden_size: 1024
num_hidden_layer: 24
num_attention_heads: 16
hidden_act: gelu
intermediate_size: 4096
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 709, in
path_model_bert=path_model_bert, path_model=path_model)
File "train.py", line 187, in get_models
model_bert.load_state_dict(res['model_bert'])
File "/home/ysfmell/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertModel:
Missing key(s) in state_dict: "encoder.layer.12.attention.self.query.weight", "encoder.layer.12.attention.self.query.bias", "encoder.layer.12.attention.self.key.weight", "encoder.layer.12.attention.self.key.bias", "encoder.layer.12.attention.self.value.weight", "encoder.layer.12.attention.self.value.bias", "encoder.layer.12.attention.output.dense.weight", "encoder.layer.12.attention.output.dense.bias", "encoder.layer.12.attention.output.LayerNorm.gamma", "encoder.layer.12.attention.output.LayerNorm.beta", "encoder.layer.12.intermediate.dense.weight", "encoder.layer.12.intermediate.dense.bias", "encoder.layer.12.output.dense.weight", "encoder.layer.12.output.dense.bias", "encoder.layer.12.output.LayerNorm.gamma", "encoder.layer.12.output.LayerNorm.beta", "encoder.layer.13.attention.self.query.weight", "encoder.layer.13.attention.self.query.bias", "encoder.layer.13.attention.self.key.weight", "encoder.layer.13.attention.self.key.bias", "encoder.layer.13.attention.self.value.weight", "encoder.layer.13.attention.self.value.bias", "encoder.layer.13.attention.output.dense.weight", "encoder.layer.13.attention.output.dense.bias", "encoder.layer.13.attention.output.LayerNorm.gamma", "encoder.layer.13.attention.output.LayerNorm.beta", "encoder.layer.13.intermediate.dense.weight", "encoder.layer.13.intermediate.dense.bias", "encoder.layer.13.output.dense.weight", "encoder.layer.13.output.dense.bias", "encoder.layer.13.output.LayerNorm.gamma", "encoder.layer.13.output.LayerNorm.beta", "encoder.layer.14.attention.self.query.weight", "encoder.layer.14.attention.self.query.bias", "encoder.layer.14.attention.self.key.weight", "encoder.layer.14.attention.self.key.bias", "encoder.layer.14.attention.self.value.weight", "encoder.layer.14.attention.self.value.bias", "encoder.layer.14.attention.output.dense.weight", "encoder.layer.14.attention.output.dense.bias", "encoder.layer.14.attention.output.LayerNorm.gamma", "encoder.layer.14.attention.output.LayerNorm.beta", "encoder.layer.14.intermediate.dense.weight", "encoder.layer.14.intermediate.dense.bias", "encoder.layer.14.output.dense.weight", "encoder.layer.14.output.dense.bias", "encoder.layer.14.output.LayerNorm.gamma", "encoder.layer.14.output.LayerNorm.beta", "encoder.layer.15.attention.self.query.weight", "encoder.layer.15.attention.self.query.bias", "encoder.layer.15.attention.self.key.weight", "encoder.layer.15.attention.self.key.bias", "encoder.layer.15.attention.self.value.weight", "encoder.layer.15.attention.self.value.bias", "encoder.layer.15.attention.output.dense.weight", "encoder.layer.15.attention.output.dense.bias", "encoder.layer.15.attention.output.LayerNorm.gamma", "encoder.layer.15.attention.output.LayerNorm.beta", "encoder.layer.15.intermediate.dense.weight", "encoder.layer.15.intermediate.dense.bias", "encoder.layer.15.output.dense.weight", "encoder.layer.15.output.dense.bias", "encoder.layer.15.output.LayerNorm.gamma", "encoder.layer.15.output.LayerNorm.beta", "encoder.layer.16.attention.self.query.weight", "encoder.layer.16.attention.self.query.bias", "encoder.layer.16.attention.self.key.weight", "encoder.layer.16.attention.self.key.bias", "encoder.layer.16.attention.self.value.weight", "encoder.layer.16.attention.self.value.bias", "encoder.layer.16.attention.output.dense.weight", "encoder.layer.16.attention.output.dense.bias", "encoder.layer.16.attention.output.LayerNorm.gamma", "encoder.layer.16.attention.output.LayerNorm.beta", "encoder.layer.16.intermediate.dense.weight", "encoder.layer.16.intermediate.dense.bias", "encoder.layer.16.output.dense.weight", "encoder.layer.16.output.dense.bias", "encoder.layer.16.output.LayerNorm.gamma", "encoder.layer.16.output.LayerNorm.beta", "encoder.layer.17.attention.self.query.weight", "encoder.layer.17.attention.self.query.bias", "encoder.layer.17.attention.self.key.weight", "encoder.layer.17.attention.self.key.bias", "encoder.layer.17.attention.self.value.weight", "encoder.layer.17.attention.self.value.bias", "encoder.layer.17.attention.output.dense.weight", "encoder.layer.17.attention.output.dense.bias", "encoder.layer.17.attention.output.LayerNorm.gamma", "encoder.layer.17.attention.output.LayerNorm.beta", "encoder.layer.17.intermediate.dense.weight", "encoder.layer.17.intermediate.dense.bias", "encoder.layer.17.output.dense.weight", "encoder.layer.17.output.dense.bias", "encoder.layer.17.output.LayerNorm.gamma", "encoder.layer.17.output.LayerNorm.beta", "encoder.layer.18.attention.self.query.weight", "encoder.layer.18.attention.self.query.bias", "encoder.layer.18.attention.self.key.weight", "encoder.layer.18.attention.self.key.bias", "encoder.layer.18.attention.self.value.weight", "encoder.layer.18.attention.self.value.bias", "encoder.layer.18.attention.output.dense.weight", "encoder.layer.18.attention.output.dense.bias", "encoder.layer.18.attention.output.LayerNorm.gamma", "encoder.layer.18.attention.output.LayerNorm.beta", "encoder.layer.18.intermediate.dense.weight", "encoder.layer.18.intermediate.dense.bias", "encoder.layer.18.output.dense.weight", "encoder.layer.18.output.dense.bias", "encoder.layer.18.output.LayerNorm.gamma", "encoder.layer.18.output.LayerNorm.beta", "encoder.layer.19.attention.self.query.weight", "encoder.layer.19.attention.self.query.bias", "encoder.layer.19.attention.self.key.weight", "encoder.layer.19.attention.self.key.bias", "encoder.layer.19.attention.self.value.weight", "encoder.layer.19.attention.self.value.bias", "encoder.layer.19.attention.output.dense.weight", "encoder.layer.19.attention.output.dense.bias", "encoder.layer.19.attention.output.LayerNorm.gamma", "encoder.layer.19.attention.output.LayerNorm.beta", "encoder.layer.19.intermediate.dense.weight", "encoder.layer.19.intermediate.dense.bias", "encoder.layer.19.output.dense.weight", "encoder.layer.19.output.dense.bias", "encoder.layer.19.output.LayerNorm.gamma", "encoder.layer.19.output.LayerNorm.beta", "encoder.layer.20.attention.self.query.weight", "encoder.layer.20.attention.self.query.bias", "encoder.layer.20.attention.self.key.weight", "encoder.layer.20.attention.self.key.bias", "encoder.layer.20.attention.self.value.weight", "encoder.layer.20.attention.self.value.bias", "encoder.layer.20.attention.output.dense.weight", "encoder.layer.20.attention.output.dense.bias", "encoder.layer.20.attention.output.LayerNorm.gamma", "encoder.layer.20.attention.output.LayerNorm.beta", "encoder.layer.20.intermediate.dense.weight", "encoder.layer.20.intermediate.dense.bias", "encoder.layer.20.output.dense.weight", "encoder.layer.20.output.dense.bias", "encoder.layer.20.output.LayerNorm.gamma", "encoder.layer.20.output.LayerNorm.beta", "encoder.layer.21.attention.self.query.weight", "encoder.layer.21.attention.self.query.bias", "encoder.layer.21.attention.self.key.weight", "encoder.layer.21.attention.self.key.bias", "encoder.layer.21.attention.self.value.weight", "encoder.layer.21.attention.self.value.bias", "encoder.layer.21.attention.output.dense.weight", "encoder.layer.21.attention.output.dense.bias", "encoder.layer.21.attention.output.LayerNorm.gamma", "encoder.layer.21.attention.output.LayerNorm.beta", "encoder.layer.21.intermediate.dense.weight", "encoder.layer.21.intermediate.dense.bias", "encoder.layer.21.output.dense.weight", "encoder.layer.21.output.dense.bias", "encoder.layer.21.output.LayerNorm.gamma", "encoder.layer.21.output.LayerNorm.beta", "encoder.layer.22.attention.self.query.weight", "encoder.layer.22.attention.self.query.bias", "encoder.layer.22.attention.self.key.weight", "encoder.layer.22.attention.self.key.bias", "encoder.layer.22.attention.self.value.weight", "encoder.layer.22.attention.self.value.bias", "encoder.layer.22.attention.output.dense.weight", "encoder.layer.22.attention.output.dense.bias", "encoder.layer.22.attention.output.LayerNorm.gamma", "encoder.layer.22.attention.output.LayerNorm.beta", "encoder.layer.22.intermediate.dense.weight", "encoder.layer.22.intermediate.dense.bias", "encoder.layer.22.output.dense.weight", "encoder.layer.22.output.dense.bias", "encoder.layer.22.output.LayerNorm.gamma", "encoder.layer.22.output.LayerNorm.beta", "encoder.layer.23.attention.self.query.weight", "encoder.layer.23.attention.self.query.bias", "encoder.layer.23.attention.self.key.weight", "encoder.layer.23.attention.self.key.bias", "encoder.layer.23.attention.self.value.weight", "encoder.layer.23.attention.self.value.bias", "encoder.layer.23.attention.output.dense.weight", "encoder.layer.23.attention.output.dense.bias", "encoder.layer.23.attention.output.LayerNorm.gamma", "encoder.layer.23.attention.output.LayerNorm.beta", "encoder.layer.23.intermediate.dense.weight", "encoder.layer.23.intermediate.dense.bias", "encoder.layer.23.output.dense.weight", "encoder.layer.23.output.dense.bias", "encoder.layer.23.output.LayerNorm.gamma", "encoder.layer.23.output.LayerNorm.beta".
size mismatch for embeddings.word_embeddings.weight: copying a param with shape torch.Size([30522, 768]) from checkpoint, the shape in current model is torch.Size([30522, 1024]).
size mismatch for embeddings.position_embeddings.weight: copying a param with shape torch.Size([512, 768]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
size mismatch for embeddings.token_type_embeddings.weight: copying a param with shape torch.Size([2, 768]) from checkpoint, the shape in current model is torch.Size([2, 1024]).
size mismatch for embeddings.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for embeddings.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.0.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.0.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.0.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.0.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.0.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.0.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.0.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.0.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.1.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.1.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.1.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.1.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.1.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.1.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.1.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.1.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.2.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.2.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.2.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.2.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.2.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.2.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.2.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.2.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.3.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.3.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.3.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.3.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.3.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.3.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.3.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.3.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.4.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.4.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.4.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.4.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.4.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.4.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.4.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.4.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.5.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.5.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.5.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.5.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.5.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.5.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.5.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.5.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.6.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.6.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.6.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.6.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.6.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.6.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.6.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.6.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.7.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.7.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.7.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.7.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.7.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.7.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.7.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.7.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.8.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.8.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.8.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.8.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.8.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.8.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.8.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.8.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.9.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.9.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.9.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.9.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.9.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.9.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.9.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.9.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.10.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.10.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.10.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.10.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.10.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.10.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.10.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.10.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.attention.self.query.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.11.attention.self.query.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.attention.self.key.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.11.attention.self.key.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.attention.self.value.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.11.attention.self.value.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.attention.output.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for encoder.layer.11.attention.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.attention.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.attention.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.intermediate.dense.weight: copying a param with shape torch.Size([3072, 768]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
size mismatch for encoder.layer.11.intermediate.dense.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for encoder.layer.11.output.dense.weight: copying a param with shape torch.Size([768, 3072]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for encoder.layer.11.output.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.output.LayerNorm.gamma: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.layer.11.output.LayerNorm.beta: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for pooler.dense.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([1024, 1024]).
size mismatch for pooler.dense.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]).

《leveraging table content for zero-shot text-to-sql with meta-learning》

Half-same idea but accepted.
Good example to learn.

bertindex_knowledge and header_knowledge

i'm struggling how to annotate bertindex_knowledge and header_knowledge, can you explain them? thanks

Provide better instructions to run

Hi,

Downloaded and tried to run as per current documentation but could get anything to run 'easily'
Can you provide better instructions / documentation on how to run on a fresh system?

i.e what libs i need etc

a requirements.txt ?

the requirements you listed obviously are not complete.

Implemented train for testing on my dataset.

corenlp.client.PermanentlyFailedException: Timed out waiting for service to come alive.
The model asked for the Type question, when I typed question and the above error pops up?
Do you know why this might be a problem? Please help.

PS: Calling the def infer function, Changed the args like --do_train to False --do_infer to True --infer_loop to True --EG to True and in the def infer block changed the args show_table to True and show_anwer_only to True.

Thanks
Bill

Testing

How do I test for a custom CSV and text for generating SQL queries?

"wvi_corenlp" ?

How are you getting the position of the keyword and how are you determining that the particular word is the keyword?
wvi_corenlp - Please describe about this.

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.