Comments (15)

tianzhaohaha commented on September 23, 2024

Also, many of the model paths are prefixed with hugging_cache/. Does that mean I should download the corresponding models myself?

tbozhong commented on September 23, 2024

Thanks for your attention to our work!
You can see detailed descriptions of hparams here.
The hugging_cache prefix in the paths is just an example; please download the mentioned models from Hugging Face or the relevant repositories yourself (e.g., MiniGPT-4 and LAVIS for BLIP-2).
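
For the Hugging Face models, here is a minimal sketch of pre-populating hugging_cache/ with the huggingface_hub package (this is not part of EasyEdit itself, and the target folder name is an assumption, so match it to your yaml):

```python
from huggingface_hub import snapshot_download

# Fetch the LLM weights into the local hugging_cache/ folder that the example
# configs reference. The target folder name is an assumption -- make it match
# whatever path your yaml's `name` / `tokenizer_name` fields point at.
snapshot_download(repo_id="lmsys/vicuna-7b-v1.5",
                  local_dir="hugging_cache/vicuna-7b-v1.5")
```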

tianzhaohaha commented on September 23, 2024

Thanks for the reply, but sorry, I am still a little confused about these hparams. Could you kindly give me an example for MEND/minigpt4.yaml? I think it just needs a few changes.

tianzhaohaha commented on September 23, 2024

There are too many confusing hparams. This is my current minigpt4.yaml; could someone help me fix it?

```yaml
# Model
device: 0
alg_name: "MEND"
name: lmsys/vicuna-7b-v1.5
model_name: minigpt4
model_class: Blip2OPT
tokenizer_class: LlamaTokenizer
tokenizer_name: lmsys/vicuna-7b-v1.5
inner_params:
  - llama_model.model.layers.29.mlp.down_proj.weight
  - llama_model.model.layers.29.mlp.up_proj.weight
  - llama_model.model.layers.30.mlp.down_proj.weight
  - llama_model.model.layers.30.mlp.up_proj.weight
  - llama_model.model.layers.31.mlp.down_proj.weight
  - llama_model.model.layers.31.mlp.up_proj.weight

# Method
alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
lr_scale: 1.0
seed: 42
cedit: 0.1
iedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
archive: results/models/MEND/minigpt4-vqa

# Train
batch_size: 1
model_save_pt: 5000
silent: False
#max_epochs: 1
max_iters: 50000
log_interval: 100
eval_log_interval: 1000
final_eval: True
val_interval: 5000
early_stop_patience: 20000
early_stop_key: "loss/total_edit_val"
eval_only: True
half: False
debug: False
save: False
verbose: True

val_batch_size: 1
accumulate_bs: 2
val_steps: 500 # only for debug
opt: Adam
grad_clip: 100.

# Output
results_dir: ./results

# Multimodal
qformer_checkpoint: hugging_cache/pretrained_minigpt4_llama2_7b.pth
qformer_name_or_path: bert-base-uncased
state_dict_file: hugging_cache/eva_vit_g.pth
pretrained_ckpt: hugging_cache/pretrained_minigpt4_llama2_7b.pth

# Image
coco_image: ../
rephrase_image: ../
```

Now I get this error:

```
RuntimeError: Error(s) in loading state_dict for MiniGPT4:
    size mismatch for llama_proj.weight: copying a param with shape torch.Size([4096, 5632]) from checkpoint, the shape in current model is torch.Size([4096, 768]).
```

tianzhaohaha commented on September 23, 2024

I think the name and tokenizer_name may not be correct?

tbozhong commented on September 23, 2024

You have incorrectly configured the qformer_checkpoint and pretrained_ckpt settings, deviating from the original repository's guidelines. Please refer to the Multimodal section in this file for the correct settings.

Feel free to specify any points of confusion so that we can optimize and provide clearer guidance in the future.
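
If it helps to see where the size mismatch comes from, here is a minimal sketch (assuming a standard torch checkpoint; the file path and key name are taken from the error message above) for inspecting the offending tensor:

```python
import torch

# Load the checkpoint on CPU and unwrap the LAVIS / MiniGPT-4 style "model"
# key if present (some .pth files store the weights directly).
ckpt = torch.load("hugging_cache/pretrained_minigpt4_llama2_7b.pth",
                  map_location="cpu")
state = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt

# Per the error above, this prints torch.Size([4096, 5632]) for the wrong
# checkpoint, while the instantiated model expects torch.Size([4096, 768]).
print(state["llama_proj.weight"].shape)
```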

tbozhong commented on September 23, 2024

You can obtain the pretrained_ckpt by downloading it from here, and for the qformer_checkpoint, you can find it here.

For more detailed information, you can refer to the code in MiniGPT-4.

tianzhaohaha commented on September 23, 2024

I really appreciate your clarification!

tianzhaohaha commented on September 23, 2024

Another question: what is qformer_checkpoint: hugging_cache/blip2_pretrained_opt2.7b.pth in blip2.yaml? Do I need to run BLIP-2 first to get the corresponding pre-trained model before I can run your code? I actually tried saving BLIP-2 as a .pth file myself, but your code reported a format mismatch: KeyError: 'model'.

tbozhong commented on September 23, 2024

Thank you for providing additional information. The correct link for downloading the qformer_checkpoint is here according to the source code from this file.
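
On the earlier KeyError: 'model': that error indicates the loader indexes the checkpoint with checkpoint["model"], i.e. it expects the weights wrapped in a dict under a "model" key, so a bare state_dict saved by hand will not load. A hypothetical sketch of the expected layout:

```python
import torch

def save_lavis_style(model: torch.nn.Module, path: str) -> None:
    # Hypothetical helper: wrap the weights under a "model" key, the layout
    # that the loader reads back as checkpoint["model"]. Saving a bare
    # state_dict instead is what raises KeyError: 'model'.
    torch.save({"model": model.state_dict()}, path)
```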

tianzhaohaha commented on September 23, 2024

Thank you for your patient guidance. I am new to this repo, and the cost of reproducing your code is too high for me. Could you please provide the correct yaml file (not an example) with the corresponding models? Your code does not support a wide range of models, at least on the multimodal side, so I don't think it's a good idea to ask researchers to find these models themselves. (Just a suggestion.)

Still getting an error:

```
RuntimeError: Error(s) in loading state_dict for Blip2OPT:
    size mismatch for opt_proj.weight: copying a param with shape torch.Size([2560, 768]) from checkpoint, the shape in current model is torch.Size([768, 768]).
    size mismatch for opt_proj.bias: copying a param with shape torch.Size([2560]) from checkpoint, the shape in current model is torch.Size([768]).
```

Below is my blip2.yaml:

```yaml
# Model
device: 1
alg_name: "MEND"
name: Salesforce/blip2-opt-2.7b
model_name: blip2
model_class: Blip2OPT
tokenizer_class: GPT2Tokenizer
tokenizer_name: Salesforce/blip2-opt-2.7b
inner_params:
  - opt_model.model.decoder.layers.29.fc1.weight
  - opt_model.model.decoder.layers.29.fc2.weight
  - opt_model.model.decoder.layers.30.fc1.weight
  - opt_model.model.decoder.layers.30.fc2.weight
  - opt_model.model.decoder.layers.31.fc1.weight
  - opt_model.model.decoder.layers.31.fc2.weight

# Method
alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
lr_scale: 1.0
seed: 42
cedit: 0.1
iedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
archive: results/models/MEND/blip2

# Train
batch_size: 1
model_save_pt: 5000
silent: False
#max_epochs: 1
max_iters: 50000
log_interval: 100
eval_log_interval: 1000
final_eval: True
val_interval: 5000
early_stop_patience: 20000
early_stop_key: "loss/total_edit_val"
eval_only: True
half: False
debug: False
save: False
verbose: True

val_batch_size: 1
accumulate_bs: 2
val_steps: 500 # only for debug
opt: Adam
grad_clip: 100.

# Output
results_dir: ./results

# Multimodal
qformer_checkpoint: hugging_cache/blip2_pretrained_opt2.7b.pth
qformer_name_or_path: bert-base-uncased
state_dict_file: hugging_cache/eva_vit_g.pth

# Image
coco_image: ../
rephrase_image: ../
```

tbozhong commented on September 23, 2024

Thank you for your feedback.

You can follow the config file, where model_name and tokenizer_name use opt-2.7b instead of blip2-opt-2.7b. Also, I guess you didn't run a trainer, so you should configure MEND following hparams/TRAINING/MEND/blip2.yaml and refer to the usage example provided here; a sketch of that flow follows below.
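
For reference, a minimal sketch of running the multimodal trainer, modeled on the repo's documented example (the class names and data file paths here are assumptions from that example and may differ across EasyEdit versions):

```python
from easyeditor import (CaptionDataset, MENDMultimodalTrainingHparams,
                        MultimodalTrainer)

# Load the training hparams (hparams/TRAINING/...), not the editing ones.
hparams = MENDMultimodalTrainingHparams.from_hparams('hparams/TRAINING/MEND/blip2.yaml')

# Assumed dataset files; point these at your local caption-editing data.
train_ds = CaptionDataset('data/caption_train_edit.json', config=hparams)
eval_ds = CaptionDataset('data/caption_eval_edit.json', config=hparams)

# Train the MEND editor; this produces the archive that editing later loads.
trainer = MultimodalTrainer(config=hparams, train_set=train_ds, val_set=eval_ds)
trainer.run()
```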

zxlzr commented on September 23, 2024

Hi, have you solved your issue yet?

tianzhaohaha commented on September 23, 2024

Actually, I just want to run your example EasyEdit_Example_Multimodal_IKE.ipynb. Still, I don't know which opt-2.7b I should set. hugging_cache is just an empty folder, so I think I should first download the model from Hugging Face for model_name and tokenizer_name. If I set hugging_cache/opt-2.7b, I get this error:

```
OSError: hugging_cache/opt-2.7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with use_auth_token or log in with huggingface-cli login and pass use_auth_token=True.
```

tbozhong commented on September 23, 2024

Thank you for the clarification. You can set model_name and tokenizer_name to facebook/opt-2.7b for convenience; note that hugging_cache is just our local folder for models manually downloaded from Hugging Face.

Certainly! If you have any more questions or need further assistance, I can reach out to you on WeChat using the provided username YouKn0wWho for convenient communication.
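
Alternatively, if you would rather keep a local hugging_cache/opt-2.7b path than use the hub ID, the same download pattern sketched earlier applies (the folder name is an assumption):

```python
from huggingface_hub import snapshot_download

# After this, hugging_cache/opt-2.7b exists as a valid local folder, so the
# path-style `name` / `tokenizer_name` values resolve without the OSError.
snapshot_download(repo_id="facebook/opt-2.7b", local_dir="hugging_cache/opt-2.7b")
```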
