Comments (23)
Thanks for sharing. This looks correct so far: When saving the adapter with PEFT, the adapter name is being removed from the key, so e.g. when the adapter name is "default" (which is the default), foo.layers.0.self_attn.q_proj.lora_A.default.weight would become foo.layers.0.self_attn.q_proj.lora_A.weight. I'm not 100% sure why it's removed -- probably it's so that we can load the adapter with a different adapter name later, but whatever the reason, that's what happens. In the key names you showed, there is no adapter name, so this is correct.
Later, when we load the adapter, we have to inject the adapter name back into the key, which is happening in the code snippet cited above. Looking through the code, I don't see what could go wrong for the adapter name to be injected twice, so that we end up with base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight. I thought that maybe the adapter name was not properly removed in the stored adapter file, but as you've shown, that's not the case. Ideally, if you can somehow create a dummy adapter that causes this issue, without any weights trained on your data, and share it as a safetensors file, I could do further debugging.
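To make the save/load round trip concrete, here is a minimal sketch in plain Python. The helper names strip_adapter_name and inject_adapter_name are hypothetical and only mimic the behaviour described above; they are not PEFT's actual implementation. The sketch shows that injecting the name into a key that still contains it produces the doubled default.default.

```python
def strip_adapter_name(key, adapter_name="default"):
    """Hypothetical helper: drop '.<adapter_name>.' from a key, as on save."""
    return key.replace(f".{adapter_name}.", ".", 1)


def inject_adapter_name(key, adapter_name="default", prefix="lora_"):
    """Hypothetical helper: put the adapter name back after the LoRA layer name."""
    if prefix not in key:
        return key  # not a LoRA parameter, leave untouched
    suffix = key.split(prefix, 1)[1]        # e.g. "A.weight"
    if "." not in suffix:
        return f"{key}.{adapter_name}"      # e.g. "...lora_embedding_A"
    tail = suffix.split(".", 1)[1]          # e.g. "weight"
    head = key[: len(key) - len(tail)]      # e.g. "...lora_A."
    return f"{head}{adapter_name}.{tail}"


in_model = "foo.layers.0.self_attn.q_proj.lora_A.default.weight"
on_disk = strip_adapter_name(in_model)      # name removed when saving
reloaded = inject_adapter_name(on_disk)     # name restored when loading
doubled = inject_adapter_name(in_model)     # name was never removed
```

Here reloaded equals the original in-model key, while doubled ends in lora_A.default.default.weight, which is the symptom reported in this thread.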
I think that I should re-train base model with LoRA config, re-convert lora adapter to safetensors, and re-load adapter and re-merge it with base model.
If that's not too much effort for you, this could certainly be a solution. I would certainly start with very little data and ensure that this time around, loading the model works, before spending too much time training.
Alternatively, what you could try to do is to modify the PEFT code a little bit so that the double adapter name is removed. So e.g. in this line, add the following snippet:
peft_model_state_dict = {k.replace("default.default", "default"): v for k, v in peft_model_state_dict.items()}
It's very blunt, but it would be interesting to see if it solves the problem.
from peft.
I appreciate your kind explanation and helpful solution!
I'll try it. Those jobs won't take long (with few epochs and a small training set).
I probably made a mistake in the previous training run, haha.
I hope I'll be able to show you the cause later. Thanks!
from peft.
PEFT should normally handle the prefix correctly. What kind of adapter are you using? Can you share it (ideally as a safetensors file) so that we can try to reproduce? I assume that if you load the saved model, it's not working as expected because the adapter is missing?
from peft.
Yes, the saved model's weights are the same as the original model's. You can download my adapter from the URL below. I used the LoRA method and saved it as adapter_model.bin.
https://drive.google.com/file/d/15tWQGR9Imrk5lKTaYRMKU70yl4_VdBNF/view?usp=drive_link
from peft.
Thanks for the link. I converted the file to safetensors and then loaded it. For me, it worked correctly:
>>> model = AutoPeftModelForCausalLM.from_pretrained(<path>, device_map="cpu")
>>> print(model)
PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): Qwen2ForCausalLM(
      (model): Qwen2Model(
        (embed_tokens): Embedding(151646, 2048)
        (layers): ModuleList(
          (0-23): 24 x Qwen2DecoderLayer(
            (self_attn): Qwen2SdpaAttention(
              (q_proj): lora.Linear(
                (base_layer): Linear(in_features=2048, out_features=2048, bias=True)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=2048, out_features=16, bias=False) # <= LoRA adapter
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=16, out_features=2048, bias=False) # <= LoRA adapter
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
etc.
>>> len([m for m in model.modules() if isinstance(m, peft.tuners.lora.LoraLayer)])
168
How exactly did you determine that the model was the same as previously?
from peft.
Yes, this adapter can be loaded correctly, but after I run "model.merge_and_unload()", the merged model's weights are the same as before. I checked this by printing the values of each layer's weights.
from peft.
Yes, the reason appears to be that the LoRA adapter was not trained. All the lora_B weights are 0, therefore LoRA is a no-op and doesn't change the weights:
>>> all((module.weight == 0.0).all() for name, module in model.named_modules() if "lora_B.default" in name)
True
You should train the LoRA adapter correctly first.
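As a small illustration of why all-zero lora_B weights make merging a no-op, the sketch below applies the usual LoRA merge rule, weight + scaling * (lora_B @ lora_A), to toy matrices in plain Python. The function names and the tiny 2x2 values are made up for illustration, and scaling stands in for the alpha / r factor; this is not PEFT's merge code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_A, lora_B, scaling=1.0):
    """Return weight + scaling * (lora_B @ lora_A) for toy list-of-rows matrices."""
    delta = matmul(lora_B, lora_A)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 2.0], [3.0, 4.0]]   # base layer, shape (out=2, in=2)
lora_A = [[0.5, -0.5]]              # shape (r=1, in=2)
lora_B_zero = [[0.0], [0.0]]        # shape (out=2, r=1), all zeros
lora_B_trained = [[1.0], [2.0]]     # same shape, non-zero

merged_zero = merge_lora(weight, lora_A, lora_B_zero)
merged_trained = merge_lora(weight, lora_A, lora_B_trained)
```

With lora_B_zero the merged matrix equals the base weight, regardless of what lora_A contains; only a non-zero lora_B changes the result.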
from peft.
Hmm, but the LoRA adapter has been trained, and I have checked that the weights in the adapter.bin file are not zero. Could there be an error in converting it to safetensors?
from peft.
Oh, very strange. I used this space to convert it but maybe I did something wrong. Could you please upload a safetensors file, so that I can check it out? If you do save_pretrained with a recent version of PEFT, it should default to safetensors automatically.
from peft.
Here:
https://drive.google.com/file/d/1CQI7UCV-zBTNBK4asHl-UUR1Lqz72JAd/view?usp=sharing
from peft.
Thanks for uploading a safetensors version. Your zip file seems to contain the same checkpoint twice, but they appear to be identical, I tried both. I still found that lora_B is all zeros:
>>> all((module.weight == 0.0).all() for name, module in model.named_modules() if "lora_B.default" in name)
True
>>> model.base_model.model.model.layers[0].self_attn.q_proj.lora_B["default"].weight
Parameter containing:
tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
Could you please verify if you find the same?
from peft.
I tried loading adapter_model.safetensors directly with safetensors and found that the weights are not all zeros. Here is my code:
from safetensors import safe_open
# Read every tensor stored in the adapter file into a dict
with safe_open('adapter_model.safetensors', framework="pt", device='cpu') as f:
    tensors = {}
    for k in f.keys():
        tensors[k] = f.get_tensor(k)
print(tensors['base_model.model.layers.0.self_attn.q_proj.lora_B.weight'])
tensor([[-0.0747, -0.0737, -0.0747, ..., 0.0762, -0.0742, -0.0737],
[-0.0391, -0.0400, -0.0400, ..., 0.0415, -0.0396, -0.0371],
[-0.0544, -0.0535, -0.0557, ..., 0.0571, -0.0549, -0.0520],
...,
[-0.0034, -0.0036, -0.0023, ..., 0.0069, -0.0031, -0.0054],
[-0.0042, -0.0039, -0.0055, ..., 0.0023, -0.0058, -0.0034],
[-0.0114, -0.0098, -0.0100, ..., 0.0110, -0.0115, -0.0120]],
dtype=torch.bfloat16)
Besides that, I have confirmed that this PEFT adapter works correctly on the downstream task when it is loaded with the AutoModel.load_adapter() method. However, the problem appears when I use the merged model, so I believe the error comes from the merging process.
from peft.
Okay, so the reason seems to be that there's a mismatch between the keys found in the adapter and the keys expected by the model. When I jump into this line and check the keys, I can see that they all mismatch:
>>> keys_found = sorted(adapters_weights.keys())
>>> keys_expected = sorted(self.state_dict())
>>> s0 = set(keys_expected)
>>> s1 = set(keys_found)
>>> len(s0), len(s1)
(627, 448)
>>> len(s0 - s1)
627
>>> len(s1 - s0)
448
>>> pp keys_found[:10]
['base_model.model.layers.0.mlp.down_proj.lora_A.weight',
'base_model.model.layers.0.mlp.down_proj.lora_B.weight',
'base_model.model.layers.0.mlp.gate_proj.lora_A.weight',
'base_model.model.layers.0.mlp.gate_proj.lora_B.weight',
'base_model.model.layers.0.mlp.up_proj.lora_A.weight',
'base_model.model.layers.0.mlp.up_proj.lora_B.weight',
'base_model.model.layers.0.self_attn.k_proj.lora_A.weight',
'base_model.model.layers.0.self_attn.k_proj.lora_B.weight',
'base_model.model.layers.0.self_attn.o_proj.lora_A.weight',
'base_model.model.layers.0.self_attn.o_proj.lora_B.weight']
>>> pp keys_expected[:10]
['base_model.model.lm_head.weight',
'base_model.model.model.embed_tokens.weight',
'base_model.model.model.layers.0.input_layernorm.weight',
'base_model.model.model.layers.0.mlp.down_proj.base_layer.weight',
'base_model.model.model.layers.0.mlp.down_proj.lora_A.default.weight',
'base_model.model.model.layers.0.mlp.down_proj.lora_B.default.weight',
'base_model.model.model.layers.0.mlp.gate_proj.base_layer.weight',
'base_model.model.model.layers.0.mlp.gate_proj.lora_A.default.weight',
'base_model.model.model.layers.0.mlp.gate_proj.lora_B.default.weight',
'base_model.model.model.layers.0.mlp.up_proj.base_layer.weight']
When creating a fresh model, I can confirm that the latter is the correct format for this model. My adapter_model.safetensors also only contained 336 entries, not 448 as in your file.
I'm not sure what exactly happened that causes the adapter you have to use a different format, maybe there was a change in the version of PEFT or transformers between creating the adapter and loading it?
To ensure that there is no bug in PEFT, I confirmed that it's possible to save and load an adapter with qwen:
>>> from transformers import AutoModelForCausalLM
>>> from peft import get_peft_model, LoraConfig, PeftModel
>>> base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-1.8B")
>>> peft_model = get_peft_model(base_model, LoraConfig(target_modules=["up_proj", "q_proj", "down_proj", "k_proj", "o_proj", "gate_proj", "v_proj"], init_lora_weights=False))
>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_A["default"].weight
Parameter containing:
tensor([[ 0.0061, 0.0040, -0.0037, ..., 0.0030, -0.0103, -0.0040],
[-0.0190, 0.0183, 0.0137, ..., -0.0065, 0.0063, -0.0156],
[ 0.0170, 0.0203, 0.0184, ..., -0.0191, 0.0132, -0.0176],
...,
[-0.0091, 0.0197, -0.0063, ..., -0.0170, -0.0003, 0.0013],
[ 0.0135, 0.0209, -0.0040, ..., -0.0119, 0.0159, 0.0164],
[ 0.0003, 0.0220, -0.0092, ..., 0.0070, 0.0012, 0.0212]],
requires_grad=True)
>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_B["default"].weight
Parameter containing:
tensor([[-0.3462, -0.1964, 0.0248, ..., 0.1943, -0.1583, -0.2640],
[-0.0430, 0.3114, 0.1676, ..., -0.0210, 0.1741, 0.2173],
[ 0.0789, 0.2819, -0.1108, ..., -0.1683, 0.1381, -0.3278],
...,
[ 0.1441, -0.0852, 0.2126, ..., -0.0384, -0.1946, 0.3313],
[-0.2722, 0.2995, 0.2065, ..., 0.0393, -0.2830, 0.3083],
[ 0.0508, 0.2045, 0.0730, ..., 0.1732, 0.3274, 0.0733]],
requires_grad=True)
>>> peft_model.save_pretrained("/tmp/peft/qwen")
>>> del base_model
>>> del peft_model
>>> base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-1.8B")
>>> peft_model = PeftModel.from_pretrained(base_model, "/tmp/peft/qwen")
>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_A["default"].weight
Parameter containing:
tensor([[ 0.0061, 0.0040, -0.0037, ..., 0.0030, -0.0103, -0.0040],
[-0.0190, 0.0183, 0.0137, ..., -0.0065, 0.0063, -0.0156],
[ 0.0170, 0.0203, 0.0184, ..., -0.0191, 0.0132, -0.0176],
...,
[-0.0091, 0.0197, -0.0063, ..., -0.0170, -0.0003, 0.0013],
[ 0.0135, 0.0209, -0.0040, ..., -0.0119, 0.0159, 0.0164],
[ 0.0003, 0.0220, -0.0092, ..., 0.0070, 0.0012, 0.0212]])
>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_B["default"].weight
Parameter containing:
tensor([[-0.3462, -0.1964, 0.0248, ..., 0.1943, -0.1583, -0.2640],
[-0.0430, 0.3114, 0.1676, ..., -0.0210, 0.1741, 0.2173],
[ 0.0789, 0.2819, -0.1108, ..., -0.1683, 0.1381, -0.3278],
...,
[ 0.1441, -0.0852, 0.2126, ..., -0.0384, -0.1946, 0.3313],
[-0.2722, 0.2995, 0.2065, ..., 0.0393, -0.2830, 0.3083],
[ 0.0508, 0.2045, 0.0730, ..., 0.1732, 0.3274, 0.0733]])
It would probably be possible to salvage this adapter by remapping the keys from its state_dict to what's actually expected.
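One way such a remapping could look is sketched below in plain Python. It assumes, based only on the key listings above, that the sole difference in the path is a missing "model." level after "base_model.model."; remap_adapter_key and remap_state_dict are hypothetical helpers, and the string values stand in for real tensors. Since the file in question had 448 entries against 336 expected, some keys may remain unmatched after remapping and should be inspected rather than assumed correct.

```python
OLD_PREFIX = "base_model.model."
NEW_PREFIX = "base_model.model.model."


def remap_adapter_key(key):
    """Hypothetical helper: insert the missing 'model.' level into a key."""
    if key.startswith(NEW_PREFIX) or not key.startswith(OLD_PREFIX):
        return key  # already in the expected format, or unrelated key
    return NEW_PREFIX + key[len(OLD_PREFIX):]


def remap_state_dict(state_dict):
    """Apply remap_adapter_key to every key, keeping the values as they are."""
    return {remap_adapter_key(k): v for k, v in state_dict.items()}


# Keys as found in the adapter file (values are placeholders for tensors).
found = {
    "base_model.model.layers.0.mlp.down_proj.lora_A.weight": "tensor-A",
    "base_model.model.layers.0.mlp.down_proj.lora_B.weight": "tensor-B",
}
# Keys as expected by the loaded model, including the adapter name.
expected_in_model = [
    "base_model.model.model.layers.0.mlp.down_proj.base_layer.weight",
    "base_model.model.model.layers.0.mlp.down_proj.lora_A.default.weight",
    "base_model.model.model.layers.0.mlp.down_proj.lora_B.default.weight",
]
# On disk the adapter name is absent, so strip it before comparing.
expected_on_disk = {
    k.replace(".default.", ".") for k in expected_in_model if "lora_" in k
}

remapped = remap_state_dict(found)
unmatched = sorted(set(remapped) - expected_on_disk)
```

An empty unmatched list means every remapped key has a counterpart in the model; the remapped dict could then be written back to a safetensors file and loaded as usual.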
from peft.
System Info.
peft==0.10.0
transformers==4.37.2
Comment
I had the same issue as above. (I trained a custom LoRA adapter, and its lora_A and lora_B layers have non-zero weights.)
I tried a different approach, using the method PeftModel.from_pretrained() with the arguments model=my_model_path and model_id=my_adapter_model_path.
Below is the code I used.
model = AutoModelForCausalLM.from_pretrained("my_model")
lora_model = PeftModel.from_pretrained(model=model, model_id="my_adapter_model_folder_path")
I also debugged for a day. There is a mismatch while running set_peft_model_state_dict() in src/peft/utils/save_and_load.py, and I found something.
peft/src/peft/utils/save_and_load.py
Line 234 in d582b68
With this original code line, k became
base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight
But I need k as
base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight
The problem is duplicated default in k.
So, I fixed the line 234 as
k = k.replace(suffix_to_replace, f"{suffix_to_replace}")
In my case, it works and merging my custom model and my custom lora adapter model succeeded.
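Note that replacing suffix_to_replace with itself turns the substitution into a no-op, so the adapter name is no longer injected at all; that can only work when the stored keys already contain it. A more defensive alternative is sketched below: a hypothetical helper, ensure_single_adapter_name, that normalizes a key so the adapter name appears exactly once after the LoRA layer name, whether the stored key had it zero, one, or two times. It is an illustration in plain Python under the key layout shown in this thread, not a patch for PEFT.

```python
def ensure_single_adapter_name(key, adapter_name="default", prefix="lora_"):
    """Hypothetical helper: make the adapter name appear exactly once."""
    if prefix not in key:
        return key  # not a LoRA parameter, leave untouched
    head, rest = key.split(prefix, 1)
    parts = rest.split(".")              # e.g. ["A", "default", "weight"]
    layer = parts[0]                     # e.g. "A" or "embedding_A"
    # Drop every existing copy of the adapter name after the layer name.
    tail = [p for p in parts[1:] if p != adapter_name]
    return ".".join([head + prefix + layer, adapter_name, *tail])


base = "base_model.model.model.layers.0.self_attn.q_proj.lora_A"
without_name = ensure_single_adapter_name(base + ".weight")
with_name = ensure_single_adapter_name(base + ".default.weight")
doubled_name = ensure_single_adapter_name(base + ".default.default.weight")
```

All three calls return the same key ending in lora_A.default.weight. The trade-off is that any path component after the LoRA layer that happens to equal the adapter name is dropped, so an adapter named e.g. "weight" would break this sketch.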
@BenjaminBossan I'd like to know why the suffix-replacing rule is needed, and whether it could be fixed. I have only tried a LoRA adapter with LlamaModel as the base model, so my solution may cause other problems with other models.
Thanks for reading all this!
from peft.
The problem is duplicated default in k.
Yes, this is very strange and should definitely not happen. Would it be possible for you to share the code how you created and saved the adapters (training code should not be necessary here), as well as how you load the adapter? I need to see that in order to figure out how this duplication could have occurred.
from peft.
Sure, the way I created my custom adapter is simple: I set up a LoraConfig with r, alpha and dropout, wrapped the model with get_peft_model(), and then trained it with DeepSpeed stage 3. (I followed the PEFT quick tour on the Hugging Face website.)
This is how I load my custom adapter (the same two lines of code as above):
model = AutoModelForCausalLM.from_pretrained("my_model")
lora_model = PeftModel.from_pretrained(model=model, model_id="my_adapter_model_folder_path")
In more detail, the duplicated default is introduced in
peft/src/peft/utils/save_and_load.py
Lines 229 to 237 in d582b68
while adapter_name is set to "default".
I confirmed that the LoRA layer names saved in my_adapter_model are identical to those in adapter_weights, which is loaded when the line below is executed.
Line 839 in d582b68
- Example of a LoRA layer name saved in the adapter and loaded by load_peft_weights(): base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight
- Example of a LoRA layer name while the for loop is being executed: base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight
If my way to load adapter is wrong, please let me know. Thanks!
from peft.
Could you additionally describe how you saved the adapter? Moreover, if you can share the adapter, that would help.
from peft.
I used transformers.Trainer.train(), Trainer.save_state() and, in the end, Trainer.save_model().
DeepSpeed saved several state files in the checkpoints, and I used zero_to_fp32.py (which is provided by DeepSpeed) to save a single adapter model as safetensors.
Sorry, I cannot share the adapter because it was trained on in-house data.
from peft.
I see, that makes sense. Since you cannot share the adapter, could you share its general structure? I.e.:
from safetensors.torch import load_file
weights = load_file("<PATH>/adapter_model.safetensors") # path to adapter weights saved by your training script
print([(k, v.shape) for k, v in weights.items()][:30]) # print 30 keys and the weight shapes
from peft.
Okay. I'll try tomorrow at the office.
By the way, is adapter_weights in the third line right? The second line uses weights.
from peft.
Yes, sorry, I made some changes to the snippet but missed that line, edited snippet should be correct now.
from peft.
Its general structure is shown below.
base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight, torch.Size([8, 8192])
base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight, torch.Size([8192, 8])
I think that I should re-train base model with LoRA config, re-convert lora adapter to safetensors, and re-load adapter and re-merge it with base model.
Maybe redoing that process will tell me something.
It will take a few days. If you find a solution, would you please let me know?
from peft.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
from peft.