
tp-berta's People

Contributors

jyansir


tp-berta's Issues

cannot find 'feature_names.json'

Hi everyone,
When I run the project, it cannot find the 'feature_names.json' file.

D:\conda_env\py38_tpberta\python.exe E:\project\tp-berta\scripts\finetune\default\run_default_config_tree.py --task=regression
load data config from: E:\project\tp-berta\scripts\finetune\default\checkpoints\tp-bin
Traceback (most recent call last):
File "E:\project\tp-berta\scripts\finetune\default\run_default_config_tree.py", line 85, in
data_config = DataConfig.from_pretrained(
File "E:\project\tp-berta\lib\data_utils.py", line 93, in from_pretrained
return DataConfig.from_config(config, tokenizer)
File "E:\project\tp-berta\lib\data_utils.py", line 116, in from_config
with open(data_dir /feature_map_file, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'E:\project\tp-berta\scripts\finetune\default\data\finetune-reg\feature_names.json'
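For context, a minimal check along these lines shows what the script is looking for before it fails. The directory path is taken from the traceback above; the suggestion that the file is produced by the repository's data-preparation step (scripts/clean_feat_names.py, as used in the next issue) is an assumption, not a confirmed fix.

    from pathlib import Path

    # Path reported in the traceback (adjust to your local checkout).
    data_dir = Path(r"E:\project\tp-berta\scripts\finetune\default\data\finetune-reg")
    feature_map_file = data_dir / "feature_names.json"

    if feature_map_file.exists():
        print("Found", feature_map_file)
    else:
        # Assumption: this file is generated by the data-preparation step,
        # e.g. `python scripts/clean_feat_names.py --mode "finetune" --task "regression"`,
        # run after the fine-tuning datasets have been placed under data/.
        print("Missing", feature_map_file)
        print("Run the data-preparation step for the regression fine-tuning datasets first.")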

Assertion Error on Fine-tuning

Hello everyone,

I am trying to fine-tune tp-berta on the tabfact dataset from Hugging Face. So far, I have:

  • downloaded the CSV file and put it in the data/finetune-bin directory;
  • downloaded the pre-trained model (on a single task) and extracted it into a /checkpoints directory;
  • successfully run the command:
    python scripts/clean_feat_names.py --mode "finetune" --task "binclass"

But when I run:
python scripts/finetune/default/run_default_config_tpberta.py --dataset "tab_fact" --task "binclass"

I get this AssertionError:

load data config from: /content/tp-berta/checkpoints/tp-bin
prepare datasets #(1/1) [tab_fact]Using cached features: lm/cache__tab_fact__0__quantile__None__None__None__None__default__e8f00a71fbb1bc9a606001064edbb78c.pickle
100% 1/1 [00:02<00:00, 2.49s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (726 > 512). Running this sequence through the model will result in indexing errors
Some weights of the model checkpoint at /content/tp-berta/checkpoints/tp-bin/pytorch_models/best were not used when initializing TPBertaForClassification: ['heads.heads.76.dense.weight', 'heads.heads.44.dense.bias', ...]

  • This IS expected if you are initializing TPBertaForClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing TPBertaForClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of TPBertaForClassification were not initialized from the model checkpoint at /content/tp-berta/checkpoints/tp-bin/pytorch_models/best and are newly initialized: ['classifier.out_proj.bias', 'classifier.out_proj.weight', 'classifier.dense.weight', 'classifier.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Finetuning: 0% 0/200 [00:00<?, ?it/s]
epoch-0: 0% 0/923 [00:00<?, ?it/s]
Finetuning: 0% 0/200 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/content/tp-berta/scripts/finetune/default/run_default_config_tpberta.py", line 154, in
logits, _ = model(**batch)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/content/tp-berta/project_bin/tpberta_modeling.py", line 458, in forward
outputs = self.tpberta(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/content/tp-berta/project_bin/tpberta_modeling.py", line 231, in forward
feature_chunk = self.intra_attention(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/content/tp-berta/project_bin/tpberta_modeling.py", line 115, in forward
assert f % b == 0
AssertionError

I have tried to manually fix the data_loader but so far no luck. Any ideas?
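One way to narrow this down is to dump the batch tensor shapes right before the call that fails. The sketch below is a hypothetical debugging helper, not part of the repository; the `logits, _ = model(**batch)` convention is taken from the traceback. The earlier warning about 726 > 512 tokens also hints that the encoded features exceed the model's maximum sequence length, which may be related.

    import torch

    def dump_batch_shapes(batch):
        # Hypothetical helper: print the shape and dtype of every tensor in the
        # batch so the dimension that breaks `assert f % b == 0` can be spotted.
        for name, value in batch.items():
            if torch.is_tensor(value):
                print(f"{name}: shape={tuple(value.shape)}, dtype={value.dtype}")
            else:
                print(f"{name}: {type(value).__name__}")

    # Usage inside the fine-tuning loop, just before the failing call:
    #   dump_batch_shapes(batch)
    #   logits, _ = model(**batch)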

fine-tuning and prediction

I've fine-tuned the model on my dataset. How do I run prediction to get the binary results? I can't find the relevant file.
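There is no dedicated prediction script visible in the listing above, so below is a minimal inference sketch, assuming the fine-tuned checkpoint is already loaded and can be called with the `logits, _ = model(**batch)` convention seen in the earlier traceback. The evaluation loader, the head layout, and the helper name are assumptions, not the repository's documented API.

    import torch

    def predict_binary(model, eval_loader, threshold=0.5):
        # Hypothetical helper: collect binary predictions from a fine-tuned
        # classification model. `eval_loader` is assumed to yield batches in
        # the same format used during fine-tuning.
        model.eval()
        preds = []
        with torch.no_grad():
            for batch in eval_loader:
                logits, _ = model(**batch)   # calling convention from the traceback
                logits = logits.squeeze(-1)
                if logits.dim() == 1:        # single-logit head: sigmoid + threshold
                    preds.append((torch.sigmoid(logits) > threshold).long().cpu())
                else:                        # multi-logit head: take the argmax class
                    preds.append(logits.argmax(dim=-1).cpu())
        return torch.cat(preds)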
