System Info: peft version 0.9.0, transformers version 4.37.2

error merge_and_unload for adapter with a prefix,about huggingface/peft

Comments (23)

BenjaminBossan commented on June 2, 2024

Thanks for sharing. This looks correct so far: When saving the adapter with PEFT, the adapter name is being removed from the key, so e.g. when the adapter name is "default" (which is the default), foo.layers.0.self_attn.q_proj.lora_A.default.weight would become foo.layers.0.self_attn.q_proj.lora_A.weight. I'm not 100% sure why it's removed -- probably it's so that we can load the adapter with a different adapter name later, but whatever the reason, that's what happens. In the key names you showed, there is no adapter name, so this is correct.
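As a rough illustration of the renaming described above, here is a sketch that mimics the behaviour on plain strings. It is not PEFT's actual implementation, and the helper name strip_adapter_name is hypothetical:

```python
def strip_adapter_name(key: str, adapter_name: str = "default") -> str:
    """Drop the '.<adapter_name>' segment from a state-dict key (illustrative only)."""
    return key.replace(f".{adapter_name}", "")


key_in_memory = "foo.layers.0.self_attn.q_proj.lora_A.default.weight"
key_on_disk = strip_adapter_name(key_in_memory)
print(key_on_disk)  # foo.layers.0.self_attn.q_proj.lora_A.weight
```

A key that never contained the adapter name passes through unchanged.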

Later, when we load the adapter, we have to inject the adapter name back into the key, which is happening in the code snippet cited above. Looking through the code, I don't see what could go wrong for the adapter name to be injected twice, so that we end up with base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight. I thought that maybe the adapter name was not properly removed in the stored adapter file, but as you've shown, that's not the case. Ideally, if you can somehow create a dummy adapter that causes this issue, without any weights trained on your data, and share it as a safetensors file, I could do further debugging.

I think that I should re-train the base model with the LoRA config, re-convert the LoRA adapter to safetensors, then re-load the adapter and re-merge it with the base model.

If that's not too much effort for you, this could certainly be a solution. I would certainly start with very little data and ensure that this time around, loading the model works, before spending too much time training.

Alternatively, what you could try to do is to modify the PEFT code a little bit so that the double adapter name is removed. So e.g. in this line, add the following snippet:

peft_model_state_dict = {k.replace("default.default", "default"): v for k, v in peft_model_state_dict.items()}

It's very blunt, but it would be interesting to see if it solves the problem.

from peft.

shjunn commented on June 2, 2024

I appreciate your kind explanation and helpful solution!
I'll try it; those jobs won't take long (few epochs and a small training set).
I probably made a mistake in the previous training run. Haha.

I hope I can show you the cause later. Thanks!

BenjaminBossan commented on June 2, 2024

PEFT should normally handle the prefix correctly. What kind of adapter are you using? Can you share it (ideally as a safetensors file) so that we can try to reproduce? I assume that if you load the saved model, it's not working as expected because the adapter is missing?

afalf commented on June 2, 2024

Yes, the saved model's weights are the same as the original model's. You can download my adapter from the URL below. I used LoRA and saved it as adapter_model.bin.
https://drive.google.com/file/d/15tWQGR9Imrk5lKTaYRMKU70yl4_VdBNF/view?usp=drive_link

BenjaminBossan commented on June 2, 2024

Thanks for the link. I converted the file to safetensors and then loaded it. For me, it worked correctly:

>>> model = AutoPeftModelForCausalLM.from_pretrained(<path>, device_map="cpu")
>>> print(model)
PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): Qwen2ForCausalLM(
      (model): Qwen2Model(
        (embed_tokens): Embedding(151646, 2048)
        (layers): ModuleList(
          (0-23): 24 x Qwen2DecoderLayer(
            (self_attn): Qwen2SdpaAttention(
              (q_proj): lora.Linear(
                (base_layer): Linear(in_features=2048, out_features=2048, bias=True)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=2048, out_features=16, bias=False)  # <= LoRA adapter
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=16, out_features=2048, bias=False)  # <= LoRA adapter
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
etc.
>>> len([m for m in model.modules() if isinstance(m, peft.tuners.lora.LoraLayer)])
168

How exactly did you determine that the model was the same as previously?

afalf commented on June 2, 2024

Yes, this adapter loads correctly, but after I run model.merge_and_unload(), the merged model's weights are the same as before. I checked this by printing the values of each layer's weights.

BenjaminBossan commented on June 2, 2024

Yes, the reason appears to be that the LoRA adapter was not trained. All the lora_B weights are 0, therefore LoRA is a no-op and doesn't change the weights:

>>> all((module.weight == 0.0).all() for name, module in model.named_modules() if "lora_B.default" in name)
True

You should train the LoRA adapter correctly first.
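To see why an all-zero lora_B makes merging a no-op: merging adds scaling * (B @ A) to the base weight, so B = 0 yields a zero update. The pure-Python sketch below uses tiny made-up matrices. The helper names matmul and merge and all numbers are illustrative assumptions; PEFT itself operates on torch tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge(w, lora_a, lora_b, scaling):
    """Return w + scaling * (lora_b @ lora_a), i.e. the LoRA merge update."""
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scaling * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


w = [[1.0, 2.0], [3.0, 4.0]]   # base weight, shape (out=2, in=2)
lora_a = [[0.5, -0.5]]         # shape (r=1, in=2)
zero_b = [[0.0], [0.0]]        # all-zero lora_B, shape (out=2, r=1)
trained_b = [[1.0], [2.0]]     # hypothetical non-zero lora_B

print(merge(w, lora_a, zero_b, scaling=2.0) == w)     # True: nothing changes
print(merge(w, lora_a, trained_b, scaling=2.0) == w)  # False: weights are updated
```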

afalf commented on June 2, 2024

Hmm, but the LoRA adapter has been trained, and I have checked that the weights in the adapter's .bin file are not zero. Could there be an error in converting it to safetensors?

BenjaminBossan commented on June 2, 2024

Oh, very strange. I used this space to convert it but maybe I did something wrong. Could you please upload a safetensors file, so that I can check it out? If you do save_pretrained with a recent version of PEFT, it should default to safetensors automatically.

afalf commented on June 2, 2024

Here:
https://drive.google.com/file/d/1CQI7UCV-zBTNBK4asHl-UUR1Lqz72JAd/view?usp=sharing

BenjaminBossan commented on June 2, 2024

Thanks for uploading a safetensors version. Your zip file seems to contain the same checkpoint twice, but they appear to be identical, I tried both. I still found that lora_B is all zeros:

>>> all((module.weight == 0.0).all() for name, module in model.named_modules() if "lora_B.default" in name)
True
>>> model.base_model.model.model.layers[0].self_attn.q_proj.lora_B["default"].weight
Parameter containing:
tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])

Could you please verify if you find the same?

afalf commented on June 2, 2024

I loaded adapter_model.safetensors directly with safetensors and found that it is not all zeros. Here is my code:

from safetensors import safe_open

# Read every tensor from the adapter file into a plain dict, bypassing PEFT.
with safe_open("adapter_model.safetensors", framework="pt", device="cpu") as f:
    tensors = {k: f.get_tensor(k) for k in f.keys()}
print(tensors["base_model.model.layers.0.self_attn.q_proj.lora_B.weight"])

tensor([[-0.0747, -0.0737, -0.0747,  ...,  0.0762, -0.0742, -0.0737],
        [-0.0391, -0.0400, -0.0400,  ...,  0.0415, -0.0396, -0.0371],
        [-0.0544, -0.0535, -0.0557,  ...,  0.0571, -0.0549, -0.0520],
        ...,
        [-0.0034, -0.0036, -0.0023,  ...,  0.0069, -0.0031, -0.0054],
        [-0.0042, -0.0039, -0.0055,  ...,  0.0023, -0.0058, -0.0034],
        [-0.0114, -0.0098, -0.0100,  ...,  0.0110, -0.0115, -0.0120]],
       dtype=torch.bfloat16)

Besides that, I have confirmed that this PEFT adapter works correctly on the downstream task when it is loaded with the AutoModel.load_adapter() method. The issue only arises with the merged model, so I believe the error comes from the merging process.

BenjaminBossan commented on June 2, 2024

Okay, so the reason seems to be that there's a mismatch between the keys found in the adapter and the keys expected by the model. When I jump into this line and check the keys, I can see that they all mismatch:

>>> keys_found = sorted(adapters_weights.keys())
>>> keys_expected = sorted(self.state_dict())
>>> s0 = set(keys_expected)
>>> s1 = set(keys_found)
>>> len(s0), len(s1)
(627, 448)
>>> len(s0 - s1)
627
>>> len(s1 - s0)
448
>>> pp keys_found[:10]
['base_model.model.layers.0.mlp.down_proj.lora_A.weight',
 'base_model.model.layers.0.mlp.down_proj.lora_B.weight',
 'base_model.model.layers.0.mlp.gate_proj.lora_A.weight',
 'base_model.model.layers.0.mlp.gate_proj.lora_B.weight',
 'base_model.model.layers.0.mlp.up_proj.lora_A.weight',
 'base_model.model.layers.0.mlp.up_proj.lora_B.weight',
 'base_model.model.layers.0.self_attn.k_proj.lora_A.weight',
 'base_model.model.layers.0.self_attn.k_proj.lora_B.weight',
 'base_model.model.layers.0.self_attn.o_proj.lora_A.weight',
 'base_model.model.layers.0.self_attn.o_proj.lora_B.weight']
>>> pp keys_expected[:10]
['base_model.model.lm_head.weight',
 'base_model.model.model.embed_tokens.weight',
 'base_model.model.model.layers.0.input_layernorm.weight',
 'base_model.model.model.layers.0.mlp.down_proj.base_layer.weight',
 'base_model.model.model.layers.0.mlp.down_proj.lora_A.default.weight',
 'base_model.model.model.layers.0.mlp.down_proj.lora_B.default.weight',
 'base_model.model.model.layers.0.mlp.gate_proj.base_layer.weight',
 'base_model.model.model.layers.0.mlp.gate_proj.lora_A.default.weight',
 'base_model.model.model.layers.0.mlp.gate_proj.lora_B.default.weight',
 'base_model.model.model.layers.0.mlp.up_proj.base_layer.weight']

When creating a fresh model, I can confirm that the latter is the correct format for this model. My adapter_model.safetensors also only contained 336 entries, not 448 as in your file.

I'm not sure what exactly happened that causes the adapter you have to use a different format, maybe there was a change in the version of PEFT or transformers between creating the adapter and loading it?

To ensure that there is no bug in PEFT, I confirmed that it's possible to save and load an adapter with qwen:

>>> from transformers import AutoModelForCausalLM
>>> from peft import get_peft_model, LoraConfig, PeftModel

>>> base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-1.8B")
>>> peft_model = get_peft_model(base_model, LoraConfig(target_modules=["up_proj", "q_proj", "down_proj", "k_proj", "o_proj", "gate_proj", "v_proj"], init_lora_weights=False))

>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_A["default"].weight
Parameter containing:
tensor([[ 0.0061,  0.0040, -0.0037,  ...,  0.0030, -0.0103, -0.0040],
        [-0.0190,  0.0183,  0.0137,  ..., -0.0065,  0.0063, -0.0156],
        [ 0.0170,  0.0203,  0.0184,  ..., -0.0191,  0.0132, -0.0176],
        ...,
        [-0.0091,  0.0197, -0.0063,  ..., -0.0170, -0.0003,  0.0013],
        [ 0.0135,  0.0209, -0.0040,  ..., -0.0119,  0.0159,  0.0164],
        [ 0.0003,  0.0220, -0.0092,  ...,  0.0070,  0.0012,  0.0212]],
       requires_grad=True)

>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_B["default"].weight
Parameter containing:
tensor([[-0.3462, -0.1964,  0.0248,  ...,  0.1943, -0.1583, -0.2640],
        [-0.0430,  0.3114,  0.1676,  ..., -0.0210,  0.1741,  0.2173],
        [ 0.0789,  0.2819, -0.1108,  ..., -0.1683,  0.1381, -0.3278],
        ...,
        [ 0.1441, -0.0852,  0.2126,  ..., -0.0384, -0.1946,  0.3313],
        [-0.2722,  0.2995,  0.2065,  ...,  0.0393, -0.2830,  0.3083],
        [ 0.0508,  0.2045,  0.0730,  ...,  0.1732,  0.3274,  0.0733]],
       requires_grad=True)

>>> peft_model.save_pretrained("/tmp/peft/qwen")
>>> del base_model
>>> del peft_model

>>> base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-1.8B")
>>> peft_model = PeftModel.from_pretrained(base_model, "/tmp/peft/qwen")
>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_A["default"].weight
Parameter containing:
tensor([[ 0.0061,  0.0040, -0.0037,  ...,  0.0030, -0.0103, -0.0040],
        [-0.0190,  0.0183,  0.0137,  ..., -0.0065,  0.0063, -0.0156],
        [ 0.0170,  0.0203,  0.0184,  ..., -0.0191,  0.0132, -0.0176],
        ...,
        [-0.0091,  0.0197, -0.0063,  ..., -0.0170, -0.0003,  0.0013],
        [ 0.0135,  0.0209, -0.0040,  ..., -0.0119,  0.0159,  0.0164],
        [ 0.0003,  0.0220, -0.0092,  ...,  0.0070,  0.0012,  0.0212]])

>>> peft_model.base_model.model.model.layers[0].self_attn.q_proj.lora_B["default"].weight
Parameter containing:
tensor([[-0.3462, -0.1964,  0.0248,  ...,  0.1943, -0.1583, -0.2640],
        [-0.0430,  0.3114,  0.1676,  ..., -0.0210,  0.1741,  0.2173],
        [ 0.0789,  0.2819, -0.1108,  ..., -0.1683,  0.1381, -0.3278],
        ...,
        [ 0.1441, -0.0852,  0.2126,  ..., -0.0384, -0.1946,  0.3313],
        [-0.2722,  0.2995,  0.2065,  ...,  0.0393, -0.2830,  0.3083],
        [ 0.0508,  0.2045,  0.0730,  ...,  0.1732,  0.3274,  0.0733]])

It would probably be possible to salvage this adapter by remapping the keys from its state_dict to what's actually expected.
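A minimal sketch of such a remapping, working on key names only. It assumes the sole difference is the missing model. segment before layers. that is visible in the key listings above. That assumption is untested against the actual adapter, and it does not explain the differing entry counts (448 versus 336). The function name remap_key is hypothetical.

```python
OLD_PREFIX = "base_model.model.layers."
NEW_PREFIX = "base_model.model.model.layers."


def remap_key(key: str) -> str:
    """Insert the missing 'model.' segment; keys with any other prefix are left alone."""
    if key.startswith(OLD_PREFIX):
        return NEW_PREFIX + key[len(OLD_PREFIX):]
    return key


found = [
    "base_model.model.layers.0.mlp.down_proj.lora_A.weight",
    "base_model.model.layers.0.mlp.down_proj.lora_B.weight",
]
remapped = [remap_key(k) for k in found]
print(remapped[0])  # base_model.model.model.layers.0.mlp.down_proj.lora_A.weight
```

With real weights, the same function could be applied to the keys of the dict returned by safetensors.torch.load_file before writing it back with safetensors.torch.save_file, but whether the result then loads and merges correctly would still need to be checked.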

shjunn commented on June 2, 2024

System Info.
peft==0.10.0
transformers==4.37.2

Comment
I had the same issue as above. (I trained a custom LoRA adapter, and its lora_A and lora_B layers have non-zero weights.)
I tried a different way of loading it, namely the method PeftModel.from_pretrained().
For that method's arguments, model=my_model_path and model_id=my_adapter_model_path.

Below is the code I used.

model = AutoModelForCausalLM.from_pretrained("my_model")
lora_model = PeftModel.from_pretrained(model=model, model_id="my_adapter_model_folder_path")

I also debugged for a day and found a mismatch that occurs while running set_peft_model_state_dict() in src/peft/utils/save_and_load.py.

peft/src/peft/utils/save_and_load.py

Line 234 in d582b68

k = k.replace(suffix_to_replace, f"{adapter_name}.{suffix_to_replace}")

With this original line, k became base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight
But I need k to be base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight
The problem is the duplicated default in k.
So I changed line 234 to k = k.replace(suffix_to_replace, f"{suffix_to_replace}")
In my case this works, and merging my custom model with my custom LoRA adapter succeeded.
@BenjaminBossan I'd like to know why the suffix-replacing rule is needed and whether it could be fixed. I have only tried a LoRA adapter with LlamaModel as the base model, so my change may cause problems with other models.

Thanks for reading all this!

BenjaminBossan commented on June 2, 2024

The problem is duplicated default in k.

Yes, this is very strange and should definitely not happen. Would it be possible for you to share the code how you created and saved the adapters (training code should not be necessary here), as well as how you load the adapter? I need to see that in order to figure out how this duplication could have occurred.

shjunn commented on June 2, 2024

Sure, the way I created my custom adapter is simple. I set up a LoraConfig with r, alpha, and dropout, wrapped the model with get_peft_model(), and then trained it with DeepSpeed stage 3. (I followed the PEFT quick tour on the Hugging Face website.)
This is how I load my custom adapter (the same two lines of code as above):

model = AutoModelForCausalLM.from_pretrained("my_model")
lora_model = PeftModel.from_pretrained(model=model, model_id="my_adapter_model_folder_path")

In more detail, the duplicated default arises in

peft/src/peft/utils/save_and_load.py

Lines 229 to 237 in d582b68

for k, v in state_dict.items():
    if parameter_prefix in k:
        suffix = k.split(parameter_prefix)[1]
        if "." in suffix:
            suffix_to_replace = ".".join(suffix.split(".")[1:])
            k = k.replace(suffix_to_replace, f"{adapter_name}.{suffix_to_replace}")
        else:
            k = f"{k}.{adapter_name}"
        peft_model_state_dict[k] = v

with adapter_name set to "default".

I confirmed that the LoRA layer names saved in my_adapter_model are identical to those in adapters_weights, which is loaded when the line below is executed.

peft/src/peft/peft_model.py

Line 839 in d582b68

 adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs) 

Example of a LoRA layer name saved in the adapter and loaded by load_peft_weights():
base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight
Example of a LoRA layer name while the for loop is executing:
base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight
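The duplication can be reproduced on plain strings. The sketch below wraps the loop body quoted above in a standalone function and applies it to a key without and with an adapter name. The function name inject_adapter_name and the default parameter_prefix value "lora_" are assumptions made for illustration.

```python
def inject_adapter_name(key: str, adapter_name: str = "default",
                        parameter_prefix: str = "lora_") -> str:
    """Apply the key-renaming logic of the quoted loop to a single key."""
    if parameter_prefix not in key:
        return key
    suffix = key.split(parameter_prefix)[1]
    if "." in suffix:
        suffix_to_replace = ".".join(suffix.split(".")[1:])
        return key.replace(suffix_to_replace, f"{adapter_name}.{suffix_to_replace}")
    return f"{key}.{adapter_name}"


base = "base_model.model.model.layers.0.self_attn.q_proj.lora_A"
print(inject_adapter_name(base + ".weight"))
# base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight
print(inject_adapter_name(base + ".default.weight"))
# base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.default.weight
```

This only shows what the loop does to each kind of key: a key that already carries the adapter name gets it a second time. It does not establish how the adapter name ended up in the loaded keys.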

If my way to load adapter is wrong, please let me know. Thanks!

BenjaminBossan commented on June 2, 2024

Could you additionally describe how you saved the adapter? Moreover, if you can share the adapter, that would help.

shjunn commented on June 2, 2024

I used transformers.Trainer.train(), Trainer.save_state() and, in the end, Trainer.save_model().
DeepSpeed saved several state files in the checkpoints, and I used zero_to_fp32.py (provided by DeepSpeed) to save a single adapter model as safetensors.

Sorry that I cannot share the adapter because it was trained with in-house data.

BenjaminBossan commented on June 2, 2024

I see, that makes sense. Since you cannot share the adapter, could you share its general structure? I.e.:

from safetensors.torch import load_file
weights = load_file("<PATH>/adapter_model.safetensors")  # path to adapter weights saved by your training script
print([(k, v.shape) for k, v in weights.items()][:30])  # print 30 keys and the weight shapes

shjunn commented on June 2, 2024

Okay. I'll try tomorrow at the office.
By the way, is adapter_weights in the third line right? The second line uses weights.

BenjaminBossan commented on June 2, 2024

Okay. I'll try tomorrow at the office.

Thanks a lot.

By the way, is adapter_weights in the third line right? The second line uses weights.

Yes, sorry, I made some changes to the snippet but missed that line, edited snippet should be correct now.

shjunn commented on June 2, 2024

Its general structure is below.

base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight, torch.Size([8, 8192])
base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight, torch.Size([8192, 8])

I think that I should re-train the base model with the LoRA config, re-convert the LoRA adapter to safetensors, then re-load the adapter and re-merge it with the base model.
Maybe redoing that process will reveal something.
It will take a few days. If you find a solution, would you please let me know?

github-actions commented on June 2, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
