
Comments (9)

BenjaminBossan commented on June 10, 2024

We had a PR once in #980 but there were a few non-trivial decisions to be made. If you want to work on a PR, please check out the discussion there. Pinging @alexrs also to see if there are any updates.

from peft.

Abdullah-kwl commented on June 10, 2024

I am using your branch alexrs:multi-ia3 to test add_weighted_adapter for IA3:

from peft import PeftModel

FIRST_ADAPTER_PATH = "/content/drive/MyDrive/FedMl_test_llm/TrainedModels/WestLake_SFT_IA3_1/Weights/Epoch_1/Job_1"
SECOND_ADAPTER_PATH = "/content/drive/MyDrive/FedMl_test_llm/TrainedModels/WestLake_SFT_IA3_2/Weights/Epoch_1/Job_1"

FIRST_ADAPTER_NAME = "first"
SECOND_ADAPTER_NAME = "second"

model = PeftModel.from_pretrained(quantized_model_4bit, FIRST_ADAPTER_PATH, FIRST_ADAPTER_NAME)
_ = model.load_adapter(SECOND_ADAPTER_PATH, adapter_name=SECOND_ADAPTER_NAME)

adapters_li = [FIRST_ADAPTER_NAME, SECOND_ADAPTER_NAME]
weights_li = [0.5, 0.5]

new_adapter = "ADAPTER_WEIGHTED"

# Remove a previously created weighted adapter so the name can be reused
if new_adapter in model.peft_config:
    model.delete_adapter(new_adapter)

model.add_weighted_adapter(adapters=adapters_li, weights=weights_li, adapter_name=new_adapter)

After this, I get the following error:
Invalid type <class 'list'> found in target_modules

I manually checked which adapters are loaded:
loaded_adapters = list(model.peft_config.keys())
print(loaded_adapters)
This shows that two adapters are loaded in the model:
['first', 'second']

I also checked the type of target_modules for each adapter:
module_type = "target_modules"
adapters = adapters_li
module_types = [type(getattr(model.peft_config[adapter], module_type)) for adapter in adapters]
print(module_types)

This prints [list, list].
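
The error message and the check above suggest that the failure happens while the target_modules of the two adapters are being merged, and that the merging step does not accept lists. The code on the branch is not shown in this thread, so the following is only a sketch of how such a merge could tolerate lists. The helper name combine_target_modules and its behavior are assumptions for illustration, not PEFT's API:

```python
from typing import Iterable, Union

TargetModules = Union[str, list, set]


def combine_target_modules(all_target_modules: Iterable[TargetModules]) -> Union[str, set]:
    """Hypothetical helper: merge the target_modules of several adapter configs.

    Strings are treated as regex patterns and joined with "|".
    Lists and sets are treated as collections of module names and unioned.
    """
    # Normalize lists to sets so that [list, list] passes the type check below.
    normalized = [set(tm) if isinstance(tm, list) else tm for tm in all_target_modules]
    if not normalized:
        raise ValueError("no target_modules given")
    if all(isinstance(tm, str) for tm in normalized):
        return "|".join(f"({tm})" for tm in normalized)
    if all(isinstance(tm, set) for tm in normalized):
        return set().union(*normalized)
    raise TypeError("target_modules must be all str or all list/set")


merged = combine_target_modules([["q_proj", "v_proj"], ["v_proj", "down_proj"]])
print(sorted(merged))  # ['down_proj', 'q_proj', 'v_proj']
```

Mixing a regex string with a list is rejected on purpose in this sketch, because there is no obvious way to union a pattern with a set of names.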


@alexrs and @BenjaminBossan, could you please help me with this? I urgently need to merge a few IA3 adapters. How can I combine them?


alexrs commented on June 10, 2024

Thanks for the ping @BenjaminBossan

Hi @Abdullah-kwl, the code I wrote for #980 is very outdated. The HF folks have been shipping lots of code lately! 👏

I guess this is a good moment to start the conversation again (sorry for the very long delay!). We want to implement add_weighted_adapter for $(IA)^3$ adapters in a way that is comparable to the LoRA implementation. The implementation, however, is a bit different for multiple reasons:

  • $(IA)^3$ introduces trainable vectors instead of matrices, therefore it does not make sense to support most of the combination types implemented for LoRA.
  • $(IA)^3$ is multiplicative: its learned vectors rescale activations element-wise. When we discussed this issue back in November last year, we did not reach an agreement on whether we should combine adapters linearly or multiply them.

I'd suggest that we can start with a simple implementation of add_weighted_adapter that combines adapters using a weighted average of $(IA)^3$ vectors (equivalent to linear in LoRA if I remember correctly).
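
To make the two options from the discussion concrete, here is a minimal sketch in plain Python, assuming each adapter contributes one scaling vector per layer. The function names weighted_average and weighted_product are made up for illustration and are not part of PEFT; whether weights should be normalized to sum to 1 is left open here, and the multiplicative variant assumes positive vector entries.

```python
import math
from typing import List, Sequence


def _check(vectors: Sequence[Sequence[float]], weights: Sequence[float]) -> None:
    if len(vectors) != len(weights):
        raise ValueError("need exactly one weight per vector")
    if len({len(v) for v in vectors}) != 1:
        raise ValueError("all vectors must be non-empty in number and equal in length")


def weighted_average(vectors: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Linear option: element j of the result is sum_i w_i * l_i[j].

    This is an average in the strict sense only if the weights sum to 1.
    """
    _check(vectors, weights)
    return [sum(w * v[j] for v, w in zip(vectors, weights)) for j in range(len(vectors[0]))]


def weighted_product(vectors: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Multiplicative option: element j of the result is prod_i l_i[j] ** w_i.

    Assumes all entries are positive (a weighted geometric mean if weights sum to 1).
    """
    _check(vectors, weights)
    return [math.prod(v[j] ** w for v, w in zip(vectors, weights)) for j in range(len(vectors[0]))]


first = [1.0, 2.0]   # toy stand-in for one adapter's scaling vector
second = [3.0, 4.0]  # toy stand-in for another adapter's scaling vector
print(weighted_average([first, second], [0.5, 0.5]))  # [2.0, 3.0]
```

With real adapters the same element-wise operation would be applied to every adapted layer, using tensors instead of lists.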

Thoughts @BenjaminBossan @pacman100 ?

I can try to prototype something in the next few days if you think this is a good approach!


Abdullah-kwl commented on June 10, 2024

It is good to start with a simpler linear approach.


BenjaminBossan commented on June 10, 2024

> I'd suggest that we can start with a simple implementation of add_weighted_adapter

That would be fantastic. Let's start with something simple and not try to have a "feature complete" copy of add_weighted_adapter for LoRA.

> I can try to prototype something in the next few days if you think this is a good approach!

Thanks, that would be great. Maybe once you have the first testable version, @Abdullah-kwl can test it out and give feedback on whether it works well or not.


alexrs commented on June 10, 2024

Hi @Abdullah-kwl

I found some time to prototype an implementation. You can find it in #1701.

It is still a work in progress, and I have not done any manual testing to check that the result is correct. Please give it a try and report any issues!

Assuming you have two $(IA)^3$ adapters, you should be able to use:

peft_model.add_weighted_adapter(ia3_adapters, [0.1, 0.9], "weighted_adapter")

to combine them using a weighted average of the adapters.


Abdullah-kwl commented on June 10, 2024

I have tested this, it is working


It is working now. The next step could be support for SVD-like combination strategies.


Abdullah-kwl commented on June 10, 2024

I am also facing an issue related to IA3: Cannot merge ia3 layers when the model is loaded in 4-bit

This is reported in #1704.

@BenjaminBossan, please also take a look at this and see whether this feature (merge_and_unload for IA3 adapters) can be added for both 4-bit and 8-bit quantized models.


BenjaminBossan commented on June 10, 2024

> I have tested this, it is working

Thanks for giving this a spin. If you have any numbers to share, like scores before and after merging, or even code, that would be great.

> I am also facing an issue related to IA3: Cannot merge ia3 layers when the model is loaded in 4-bit

Indeed, this is not yet supported. We will certainly take a look at this at some point, but contributions are also very welcome. (And please don't post the same issue twice.)

