
Comments (7)

pacman100 commented on August 27, 2024

Hello, the LoRA model adds modules to the original model in place. Please use the function below to get the number of trainable parameters and their percentage:

def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )


sayakpaul commented on August 27, 2024

Thanks!


sayakpaul commented on August 27, 2024

@pacman100

If we call print_trainable_parameters(model) and print_trainable_parameters(lora_model), they print the same numbers:

trainable params: 667493 || all params: 86466149 || trainable%: 0.7719703117574949
trainable params: 667493 || all params: 86466149 || trainable%: 0.7719703117574949

Reference Colab: https://colab.research.google.com/drive/1GtKXiVyINz2rRnd5FSMh2WnTjt7nOTEn?usp=sharing#scrollTo=-4zp34QGjEoi

I know LoraModel adds modules in place, but this might confuse users who want to investigate LoRA's parameter-efficiency gains by comparing trainable parameter counts. Could we provide some kind of wrapper that avoids the in-place modifications?
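As a workaround in the meantime, something like the following could keep the original model intact (a rough sketch only; the LoraConfig values are placeholders and target_modules depends on the architecture):

import copy

from peft import LoraConfig, LoraModel

# Placeholder config; adjust r, lora_alpha, and target_modules for the model at hand.
config = LoraConfig(r=16, lora_alpha=16, target_modules=["query", "value"], lora_dropout=0.1)

# Apply LoRA to a deep copy so `model` itself keeps all of its parameters trainable.
lora_model = LoraModel(config, copy.deepcopy(model))

print_trainable_parameters(model)       # unchanged: 100% trainable
print_trainable_parameters(lora_model)  # only the LoRA params are trainable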


pacman100 commented on August 27, 2024

Hello, the number of trainable parameters is just 0.77% of the total number of params in the model, which should indicate the parameter efficiency of the fine-tuning. I am confused about the issue here. In the usual case of causal language modeling or seq2seq, only the LoRA params are trainable and all the base model params are frozen, in contrast to full fine-tuning, where all the base model params are trainable.
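One quick way to see this (a small sketch, assuming the lora_model mentioned above) is to list which parameters still require gradients after the LoRA model is created:

# Only the modules injected by LoRA should appear here; the base model
# parameters are frozen (requires_grad=False).
trainable = [name for name, param in lora_model.named_parameters() if param.requires_grad]
print(trainable)  # names containing "lora_A" / "lora_B"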


sayakpaul commented on August 27, 2024

Sorry if I wasn't clear.

In the above issue, model is the original model and lora_model is the LoRA-modified version of model. My issue is that the total number of trainable parameters of the original model should stay as is, and that is not the case here (probably because of the in-place modifications introduced by PEFT).

Let's illustrate what I am saying with an example:

from transformers import AutoModelForImageClassification

model_checkpoint = "google/vit-base-patch32-224-in21k"
label2id = {"dog": 0, "cat": 1, "mouse": 2}
id2label = {v: k for k, v in label2id.items()}
model = AutoModelForImageClassification.from_pretrained(
    model_checkpoint, 
    label2id=label2id,
    id2label=id2label,
    ignore_mismatched_sizes=True, 
)


def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )


print_trainable_parameters(model)

As expected, it prints

trainable params: 87457539 || all params: 87457539 || trainable%: 100.0

After the LoRA modifications are added, we see the reduction in the number of trainable parameters.

from peft import LoraModel

# `config` is the LoraConfig defined earlier in the linked Colab.
lora_model = LoraModel(config, model)
print_trainable_parameters(lora_model)

However, when I now call print_trainable_parameters(model) (notice that it's model and NOT lora_model), I get the same output as print_trainable_parameters(lora_model).

While that is expected, since the LoRA modifications are done in place as per #41 (comment), this might be confusing for users, as I mentioned in #41 (comment).

Let me know if what I am trying to convey is still unclear.


pacman100 commented on August 27, 2024

Hello, LoRA modifying the model in place is what makes it widely applicable. On the other hand, I don't think this should be an issue if you compare the trainable params before and after creating the PEFT model, as shown below:

(Screenshot from 2023-02-01 showing the trainable-parameter counts printed before and after creating the PEFT model.)
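In code, that comparison looks roughly like this (a sketch, assuming the ViT model and print_trainable_parameters from above; the config values are placeholders):

from peft import LoraConfig, get_peft_model

# Before: the plain base model, with every parameter trainable.
print_trainable_parameters(model)

# Placeholder config; pick target_modules that match your architecture.
config = LoraConfig(r=16, lora_alpha=16, target_modules=["query", "value"], lora_dropout=0.1)
peft_model = get_peft_model(model, config)

# After: only the LoRA parameters are trainable.
print_trainable_parameters(peft_model)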


sayakpaul commented on August 27, 2024

Cool cool! Just want to ensure that the users are aware of the in-place modification :)

