
Comments (7)

pacman100 commented on August 27, 2024

Hello, the LoRA model adds modules to the original model in place. Please use the function below to get the number of trainable parameters and their percentage:

def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )


sayakpaul commented on August 27, 2024

Thanks!


sayakpaul commented on August 27, 2024

@pacman100

If we call print_trainable_parameters(model) and print_trainable_parameters(lora_model), they print the same numbers:

trainable params: 667493 || all params: 86466149 || trainable%: 0.7719703117574949
trainable params: 667493 || all params: 86466149 || trainable%: 0.7719703117574949

Reference Colab: https://colab.research.google.com/drive/1GtKXiVyINz2rRnd5FSMh2WnTjt7nOTEn?usp=sharing#scrollTo=-4zp34QGjEoi

I know LoraModel adds modules in place, but this might confuse users who want to investigate LoRA's parameter-efficiency gains by comparing trainable parameter counts. Could we provide some kind of wrapper that avoids the in-place modifications?
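As a workaround in the meantime, something like the following could keep the original model intact (a rough sketch only; the LoraConfig values are placeholders and target_modules depends on the architecture):

import copy

from peft import LoraConfig, LoraModel

# Placeholder config; adjust r, lora_alpha, and target_modules for the model at hand.
config = LoraConfig(r=16, lora_alpha=16, target_modules=["query", "value"], lora_dropout=0.1)

# Apply LoRA to a deep copy so `model` itself keeps all of its parameters trainable.
lora_model = LoraModel(config, copy.deepcopy(model))

print_trainable_parameters(model)       # unchanged: 100% trainable
print_trainable_parameters(lora_model)  # only the LoRA params are trainable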


pacman100 commented on August 27, 2024

Hello, the number of trainable parameters is just 0.77% of the total number of params in the model, which should indicate the parameter efficiency of the fine-tuning. I am confused about the issue here. In the usual case of causal language modeling or seq2seq, only the LoRA params are trainable and all the base model params are frozen, in contrast to full fine-tuning, where all the base model params are trainable.
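One quick way to see this (a small sketch, assuming the lora_model mentioned above) is to list which parameters still require gradients after the LoRA model is created:

# Only the modules injected by LoRA should appear here; the base model
# parameters are frozen (requires_grad=False).
trainable = [name for name, param in lora_model.named_parameters() if param.requires_grad]
print(trainable)  # names containing "lora_A" / "lora_B"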


sayakpaul commented on August 27, 2024

Sorry if I wasn't clear.

In the above issue, model is the original model and lora_model is the LoRA-modified version of model. My issue is that the total number of trainable parameters of the original model should stay as is, and that is not the case here (probably because of the in-place modifications introduced by PEFT).

Let's illustrate what I am saying with an example:

from transformers import AutoModelForImageClassification

model_checkpoint = "google/vit-base-patch32-224-in21k"
label2id = {"dog": 0, "cat": 1, "mouse": 2}
id2label = {v: k for k, v in label2id.items()}
model = AutoModelForImageClassification.from_pretrained(
    model_checkpoint, 
    label2id=label2id,
    id2label=id2label,
    ignore_mismatched_sizes=True, 
)


def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )


print_trainable_parameters(model)

As expected, it prints

trainable params: 87457539 || all params: 87457539 || trainable%: 100.0

After the LoRA modifications are added, we see the reduction in the number of trainable parameters.

from peft import LoraModel

# `config` is the LoraConfig defined earlier in the linked Colab.
lora_model = LoraModel(config, model)
print_trainable_parameters(lora_model)

However, when I now call print_trainable_parameters(model) (notice that it's model and NOT lora_model), I get the same output as print_trainable_parameters(lora_model).

While that is expected, since the LoRA modifications are done in place as per #41 (comment), this might be confusing for users, as I mentioned in #41 (comment).

Let me know if what I am trying to convey is still unclear.


pacman100 commented on August 27, 2024

Hello, LoRA modifying the model in place is what makes it widely applicable. On the other hand, I don't think this should be an issue if you compare the trainable params before and after creating the PEFT model, as shown below:

(Screenshot from 2023-02-01 showing the trainable-parameter counts printed before and after creating the PEFT model.)
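In code, that comparison looks roughly like this (a sketch, assuming the ViT model and print_trainable_parameters from above; the config values are placeholders):

from peft import LoraConfig, get_peft_model

# Before: the plain base model, with every parameter trainable.
print_trainable_parameters(model)

# Placeholder config; pick target_modules that match your architecture.
config = LoraConfig(r=16, lora_alpha=16, target_modules=["query", "value"], lora_dropout=0.1)
peft_model = get_peft_model(model, config)

# After: only the LoRA parameters are trainable.
print_trainable_parameters(peft_model)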


sayakpaul commented on August 27, 2024

Cool cool! Just want to ensure that the users are aware of the in-place modification :)

