Comments (15)
Also, many model paths are prefixed with hugging_cache/. Does that mean I should download the corresponding models myself?
from easyedit.
Thanks for your attention to our work!
You can see detailed descriptions of hparams here.
Paths with the hugging_cache prefix are just examples; please download the mentioned models yourself from Hugging Face or the relevant repositories (e.g., MiniGPT-4 and LAVIS for BLIP-2).
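In other words, hugging_cache/ is just a local staging folder. An assumed layout might look like the sketch below; the exact filenames come from the respective repositories' download pages, so treat this as illustrative only:

```
hugging_cache/
├── vicuna-7b-v1.5/                      # lmsys/vicuna-7b-v1.5 from Hugging Face
├── eva_vit_g.pth                        # EVA ViT weights (LAVIS)
├── blip2_pretrained_opt2.7b.pth         # Q-Former weights (LAVIS)
└── pretrained_minigpt4_llama2_7b.pth    # MiniGPT-4 checkpoint
```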
Thanks for the reply, but I'm still a little confused about these hparams. Could you kindly give me an example for MEND/minigpt4.yaml? I think it only needs a small change.
There are too many confusing hparams. This is my current minigpt4.yaml; could someone help me fix it?
```yaml
# Model
device: 0
alg_name: "MEND"
name: lmsys/vicuna-7b-v1.5
model_name: minigpt4
model_class: Blip2OPT
tokenizer_class: LlamaTokenizer
tokenizer_name: lmsys/vicuna-7b-v1.5
inner_params:
  - llama_model.model.layers.29.mlp.down_proj.weight
  - llama_model.model.layers.29.mlp.up_proj.weight
  - llama_model.model.layers.30.mlp.down_proj.weight
  - llama_model.model.layers.30.mlp.up_proj.weight
  - llama_model.model.layers.31.mlp.down_proj.weight
  - llama_model.model.layers.31.mlp.up_proj.weight

# Method
alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
lr_scale: 1.0
seed: 42
cedit: 0.1
iedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
archive: results/models/MEND/minigpt4-vqa

# Train
batch_size: 1
model_save_pt: 5000
silent: False
# max_epochs: 1
max_iters: 50000
log_interval: 100
eval_log_interval: 1000
final_eval: True
val_interval: 5000
early_stop_patience: 20000
early_stop_key: "loss/total_edit_val"
eval_only: True
half: False
debug: False
save: False
verbose: True
val_batch_size: 1
accumulate_bs: 2
val_steps: 500  # only for debug
opt: Adam
grad_clip: 100.

# Output
results_dir: ./results

# Multimodal
qformer_checkpoint: hugging_cache/pretrained_minigpt4_llama2_7b.pth
qformer_name_or_path: bert-base-uncased
state_dict_file: hugging_cache/eva_vit_g.pth
pretrained_ckpt: hugging_cache/pretrained_minigpt4_llama2_7b.pth

# image
coco_image: ../
rephrase_image: ../
```
Now I get this error:

```
RuntimeError: Error(s) in loading state_dict for MiniGPT4:
	size mismatch for llama_proj.weight: copying a param with shape torch.Size([4096, 5632]) from checkpoint, the shape in current model is torch.Size([4096, 768]).
```
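For context, this "size mismatch" message is what strict `load_state_dict`-style checking produces: every checkpoint tensor must match the shape the freshly constructed model expects. A minimal stand-in for that check (keys and shapes copied from the trace above; plain tuples stand in for tensors):

```python
# Shapes recorded as (out_features, in_features), mirroring the trace above.
checkpoint = {"llama_proj.weight": (4096, 5632)}  # saved with one visual width
model      = {"llama_proj.weight": (4096, 768)}   # built expecting another

# Strict loading compares shapes key by key and reports every mismatch.
mismatches = [
    f"size mismatch for {k}: checkpoint has {checkpoint[k]}, "
    f"current model expects {model[k]}"
    for k in checkpoint
    if k in model and model[k] != checkpoint[k]
]
print(mismatches[0])
```

A mismatch like this usually means the checkpoint was produced for a different backbone than the one the config builds, which is why fixing the checkpoint paths (rather than the model code) resolves it.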
I think the `name` and `tokenizer_name` may not be correct?
You have incorrectly configured the `qformer_checkpoint` and `pretrained_ckpt` settings, deviating from the original repository's guidelines. Please refer to the Multimodal section in this file for the correct settings. Feel free to point out anything that confused you so that we can provide clearer guidance in the future.
You can obtain the `pretrained_ckpt` by downloading it from here, and the `qformer_checkpoint` can be found here.
For more detailed information, you can refer to the code in MiniGPT-4.
I really appreciate your clarification!
Another question: what is `qformer_checkpoint: hugging_cache/blip2_pretrained_opt2.7b.pth` in blip2.yaml? Do I need to run BLIP-2 first to get the corresponding pre-trained model before I can run your code? I actually tried to save BLIP-2 as a .pth file myself, but your code reported a format mismatch: `KeyError: 'model'`.
Thank you for providing additional information. The correct link for downloading the `qformer_checkpoint` is here, according to the source code in this file.
Thank you for your patient guidance. I am new to this repo, and the cost of reproducing your code is too high for me. Could you please provide the correct yaml files (not examples) with the corresponding models? Your code does not support a wide range of models, at least on the multimodal side, so I don't think it's a good idea to ask researchers to track down these models themselves. (Just a suggestion.)
I'm still getting an error:

```
RuntimeError: Error(s) in loading state_dict for Blip2OPT:
	size mismatch for opt_proj.weight: copying a param with shape torch.Size([2560, 768]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for opt_proj.bias: copying a param with shape torch.Size([2560]) from checkpoint, the shape in current model is torch.Size([768]).
```
Below is my blip2.yaml:
```yaml
# Model
device: 1
alg_name: "MEND"
name: Salesforce/blip2-opt-2.7b
model_name: blip2
model_class: Blip2OPT
tokenizer_class: GPT2Tokenizer
tokenizer_name: Salesforce/blip2-opt-2.7b
inner_params:
  - opt_model.model.decoder.layers.29.fc1.weight
  - opt_model.model.decoder.layers.29.fc2.weight
  - opt_model.model.decoder.layers.30.fc1.weight
  - opt_model.model.decoder.layers.30.fc2.weight
  - opt_model.model.decoder.layers.31.fc1.weight
  - opt_model.model.decoder.layers.31.fc2.weight

# Method
alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
lr_scale: 1.0
seed: 42
cedit: 0.1
iedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
archive: results/models/MEND/blip2

# Train
batch_size: 1
model_save_pt: 5000
silent: False
# max_epochs: 1
max_iters: 50000
log_interval: 100
eval_log_interval: 1000
final_eval: True
val_interval: 5000
early_stop_patience: 20000
early_stop_key: "loss/total_edit_val"
eval_only: True
half: False
debug: False
save: False
verbose: True
val_batch_size: 1
accumulate_bs: 2
val_steps: 500  # only for debug
opt: Adam
grad_clip: 100.

# Output
results_dir: ./results

# Multimodal
qformer_checkpoint: hugging_cache/blip2_pretrained_opt2.7b.pth
qformer_name_or_path: bert-base-uncased
state_dict_file: hugging_cache/eva_vit_g.pth

# image
coco_image: ../
rephrase_image: ../
```
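Before launching with a config like this, one quick sanity check is that every checkpoint path in the Multimodal section has actually been downloaded. A small self-contained sketch (paths copied from the config above; adapt to your own layout):

```python
import os

# Checkpoint paths taken from the Multimodal section of blip2.yaml.
paths = {
    "qformer_checkpoint": "hugging_cache/blip2_pretrained_opt2.7b.pth",
    "state_dict_file": "hugging_cache/eva_vit_g.pth",
}

# Report any checkpoint that has not been downloaded into hugging_cache/ yet.
missing = [key for key, path in paths.items() if not os.path.exists(path)]
for key in missing:
    print(f"missing: {key} -> {paths[key]}")
```

An empty report does not guarantee the checkpoints are the *right* ones (as the size-mismatch errors in this thread show), but it rules out the simpler failure mode of a missing file.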
Thank you for your feedback.
You can follow the config file, where `model_name` and `tokenizer_name` use opt-2.7b instead of blip2-opt-2.7b. Also, I guess you didn't run a trainer, so you should configure MEND following hparams/TRAINING/MEND/blip2.yaml and refer to the example of using it provided here.
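For reference, a training run with that TRAINING config might look like the sketch below. The class names (`MENDMultimodalTrainingHparams`, `CaptionDataset`, `MultimodalTrainer`) and data paths follow the EasyEdit README's multimodal example, but may differ across versions, so treat this as a sketch rather than a verified invocation:

```python
def train_mend_blip2():
    # Imports are kept inside the function so the sketch can be read
    # without EasyEdit installed; names follow the repository's README.
    from easyeditor import (CaptionDataset, MENDMultimodalTrainingHparams,
                            MultimodalTrainer)

    hparams = MENDMultimodalTrainingHparams.from_hparams(
        'hparams/TRAINING/MEND/blip2.yaml')
    # Dataset file names here are assumptions based on the README examples.
    train_ds = CaptionDataset('data/caption_train_edit.json', config=hparams)
    eval_ds = CaptionDataset('data/caption_eval_edit.json', config=hparams)
    trainer = MultimodalTrainer(config=hparams,
                                train_set=train_ds,
                                val_set=eval_ds)
    trainer.run()
```

Running the trainer first produces the `archive` checkpoint that the editing config later loads.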
Hi, have you solved your issue yet?
Actually, I just want to run your example EasyEdit_Example_Multimodal_IKE.ipynb. Still, I don't know which opt-2.7b I should use; hugging_cache is just an empty folder, so I think I need to download the model from Hugging Face first for model_name and tokenizer_name. If I set hugging_cache/opt-2.7b, I get this error:

```
OSError: hugging_cache/opt-2.7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
```
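This OSError reflects the loader's lookup order: an existing local directory wins, and otherwise the string must resolve as a Hub repo id. A toy model of that order (a stand-in set replaces the real Hub; this is illustrative, not the actual transformers code):

```python
import os

# Stand-in for the Hugging Face Hub model index.
HUB_STAND_IN = {"facebook/opt-2.7b", "Salesforce/blip2-opt-2.7b"}

def resolve(name_or_path: str) -> str:
    """Illustrative lookup: local directory first, then the (stand-in) Hub."""
    if os.path.isdir(name_or_path):
        return f"local:{name_or_path}"
    if name_or_path in HUB_STAND_IN:
        return f"hub:{name_or_path}"
    raise OSError(f"{name_or_path} is not a local folder and is not a valid "
                  "model identifier listed on 'https://huggingface.co/models'")

print(resolve("facebook/opt-2.7b"))    # resolves as a Hub repo id
try:
    resolve("hugging_cache/opt-2.7b")  # empty/missing folder, unknown repo id
except OSError as e:
    print(e)
```

So `hugging_cache/opt-2.7b` only works after the model has actually been downloaded into that directory; until then, a plain Hub id like `facebook/opt-2.7b` is the path of least resistance.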
Thank you for the clarification. You can set `model_name` and `tokenizer_name` to facebook/opt-2.7b for convenience, and I'll take note of hugging_cache as our local folder for manually downloaded models from Hugging Face.
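Concretely, that would change only these fields in blip2.yaml (a sketch; the remaining fields stay as in the config posted above):

```yaml
name: facebook/opt-2.7b
tokenizer_name: facebook/opt-2.7b
```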
Certainly! If you have any more questions or need further assistance, I can reach out to you on WeChat using the provided username YouKn0wWho for convenient communication.
Related Issues (20)
- Error when running function test_IKE_Blip2OPT_VQA() in multimodal_edit.py HOT 3
- Error when Running IKE for Wiki Counterfactual dataset HOT 10
- Builder config error when running MEMIT HOT 9
- Differences between ft_main.py and lora_main.py HOT 2
- Full datasets loading fix HOT 1
- There is a question about 'edited_model' in easyeditor/editor.py, why edit() function always return the last edited_model only? HOT 5
- Model edition does not take effects HOT 8
- Is it possible to set a random seed? HOT 8
- How to apply it to new models HOT 5
- Request for 'Editing GPU memory usage' update HOT 2
- Can MEMIT produce identical weight updates on every run (with completely identical code)? HOT 6
- Problems about the results of GRACE method on Llama-2-7b HOT 14
- Is the OpenCompass mentioned in the paper required for evaluation? HOT 2
- T-Patcher support HOT 2
- Question about E-VQA HOT 8
- Could there be a bug in the FT implementation HOT 6
- MELO fails to edit gpt2-xl HOT 7
- it seems there is an import bug HOT 5
- error when running knowedit HOT 8