Comments (9)
We had a PR once in #980 but there were a few non-trivial decisions to be made. If you want to work on a PR, please check out the discussion there. Pinging @alexrs also to see if there are any updates.
from peft.
I am using your branch alexrs:multi-ia3 to test add_weighted_adapter for IA3:
FIRST_ADAPTER_PATH = "/content/drive/MyDrive/FedMl_test_llm/TrainedModels/WestLake_SFT_IA3_1/Weights/Epoch_1/Job_1"
SECOND_ADAPTER_PATH = "/content/drive/MyDrive/FedMl_test_llm/TrainedModels/WestLake_SFT_IA3_2/Weights/Epoch_1/Job_1"
FIRST_ADAPTER_NAME = "first"
SECOND_ADAPTER_NAME = "second"
model = PeftModel.from_pretrained(quantized_model_4bit, FIRST_ADAPTER_PATH, FIRST_ADAPTER_NAME)
_ = model.load_adapter(SECOND_ADAPTER_PATH, adapter_name=SECOND_ADAPTER_NAME)
adapters_li = [FIRST_ADAPTER_NAME, SECOND_ADAPTER_NAME]
weights_li = [0.5, 0.5]
new_adapter = "ADAPTER_WEIGHTED"
if new_adapter in model.peft_config:
    model.delete_adapter(new_adapter)
model.add_weighted_adapter(adapters=adapters_li, weights=weights_li, adapter_name=new_adapter)
After this, it raises the following error:
Invalid type <class 'list'> found in target_modules
I checked the loaded adapters with this code:
loaded_adapters = list(model.peft_config.keys())
print(loaded_adapters)
which shows that two adapters are loaded in the model:
['first', 'second']
I also checked the types of target_modules manually:
module_type = "target_modules"
adapters = adapters_li
module_types = [type(getattr(model.peft_config[adapter], module_type)) for adapter in adapters]
print(module_types)  # shows [list, list]
@alexrs and @BenjaminBossan, could you please help with this? I urgently need to merge a few IA3 adapters. How can I combine them?
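The error message and the [list, list] check above suggest that the merge step does not accept list-valued target_modules. As a minimal sketch of how such values could be normalized before combining them, the helper below treats lists, tuples and sets alike and joins regex strings with "|". The name combine_target_modules and the sample module names are hypothetical; this is not PEFT's code and not the code on the alexrs:multi-ia3 branch.

```python
from typing import Iterable, Set, Union

TargetModules = Union[str, Iterable[str]]


def combine_target_modules(values: Iterable[TargetModules]) -> Union[str, Set[str]]:
    """Hypothetical helper (not part of PEFT): merge target_modules of several adapters."""
    values = list(values)
    if not values:
        raise ValueError("no target_modules given")
    # Strings are treated as regex patterns; all-string inputs are OR-ed together.
    if all(isinstance(v, str) for v in values):
        return "|".join(f"({v})" for v in values)
    if any(isinstance(v, str) for v in values):
        raise TypeError("cannot mix regex strings with collections of module names")
    # Lists, tuples and sets are all normalized into one set of module names.
    combined: Set[str] = set()
    for v in values:
        combined |= set(v)
    return combined


# Stand-ins for model.peft_config[name].target_modules, both stored as lists.
first_targets = ["k_proj", "v_proj", "down_proj"]
second_targets = ["k_proj", "v_proj", "down_proj"]
merged_targets = combine_target_modules([first_targets, second_targets])
print(sorted(merged_targets))
```

Whether the real fix belongs in the config (converting lists to sets on load) or in the merge check is a design decision for the maintainers; the sketch only illustrates that list-valued entries can be combined without ambiguity.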
Thanks for the ping @BenjaminBossan
Hi @Abdullah-kwl, the code I wrote for #980 is very outdated. The HF folks have been shipping lots of code lately! 👏
I guess this is a good moment to start the conversation again (sorry for the very long delay!). We want to implement add_weighted_adapter for $(IA)^3$:
- $(IA)^3$ introduces trainable vectors instead of matrices, therefore it does not make sense to support most of the combination types implemented for LoRA.
- $(IA)^3$ follows multiplicative operators. When we discussed this issue back in November last year, we did not reach an agreement on whether we should do a linear combination of adapters, or multiply them.
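To make the two options concrete, here is a small sketch on toy scaling vectors, assuming each adapter contributes one vector per target module. The function names and the toy values are made up for illustration. The multiplicative variant is only one possible reading of "multiply them": it raises each vector to its weight before multiplying, which reduces to a plain element-wise product when all weights are 1 and requires positive entries for fractional weights.

```python
import math
from typing import List, Sequence


def linear_combination(vectors: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Element-wise weighted sum: sum_i w_i * l_i."""
    size = len(vectors[0])
    return [sum(w * v[j] for v, w in zip(vectors, weights)) for j in range(size)]


def multiplicative_combination(vectors: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Element-wise weighted product: prod_i l_i ** w_i (assumes positive entries)."""
    size = len(vectors[0])
    return [math.prod(v[j] ** w for v, w in zip(vectors, weights)) for j in range(size)]


# Toy (IA)^3 scaling vectors for a single module; values are illustrative only.
l_first = [1.0, 2.0, 0.5]
l_second = [1.0, 0.5, 2.0]
weights = [0.5, 0.5]

linear = linear_combination([l_first, l_second], weights)
multiplied = multiplicative_combination([l_first, l_second], weights)
print("linear:        ", linear)
print("multiplicative:", multiplied)
```

On these values the two options disagree wherever the adapters scale in opposite directions: the linear combination still amplifies those entries, while the weighted product brings them back to roughly 1, i.e. no rescaling.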
I'd suggest that we can start with a simple implementation of add_weighted_adapter that combines adapters using a weighted average (similar to the linear combination type in LoRA, if I remember correctly).
Thoughts @BenjaminBossan @pacman100 ?
I can try to prototype something in the next few days if you think this is a good approach!
It is good to start with a simpler linear approach.
I'd suggest that we can start with a simple implementation of add_weighted_adapter
That would be fantastic. Let's start with something simple and not try to have a "feature complete" copy of add_weighted_adapter for LoRA.
I can try to prototype something in the next few days if you think this is a good approach!
Thanks, that would be great. Maybe once you have the first testable version, @Abdullah-kwl can test it out and give feedback on whether it works well or not.
I found some time to prototype an implementation. You can find it in #1701.
It is still a work in progress, and I have not done any manual testing to check that the result is correct. You can give it a try and report any issues!
Assuming you have two $(IA)^3$ adapters, you can call
peft_model.add_weighted_adapter(ia3_adapters, [0.1, 0.9], "weighted_adapter")
to combine them using a weighted average of the adapters.
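As a rough illustration of what a weighted average with weights [0.1, 0.9] produces, the sketch below averages toy per-module vectors held in plain dictionaries. It does not reproduce the code in #1701: weighted_average_adapters, the module names and the values are hypothetical, and dividing by the sum of the weights is an assumption (it changes nothing when the weights already sum to 1).

```python
from typing import Dict, List, Sequence

AdapterState = Dict[str, List[float]]  # module name -> (IA)^3 scaling vector


def weighted_average_adapters(adapters: Sequence[AdapterState], weights: Sequence[float]) -> AdapterState:
    """Hypothetical helper: per-module weighted average of adapter vectors."""
    if not adapters or len(adapters) != len(weights):
        raise ValueError("need one weight per adapter")
    module_names = set(adapters[0])
    if any(set(adapter) != module_names for adapter in adapters):
        raise ValueError("all adapters must target the same modules")
    total = sum(weights)
    combined: AdapterState = {}
    for name in adapters[0]:
        size = len(adapters[0][name])
        combined[name] = [
            sum(w * adapter[name][j] for adapter, w in zip(adapters, weights)) / total
            for j in range(size)
        ]
    return combined


# Toy stand-ins for two loaded adapters; values are illustrative only.
first_state = {"k_proj": [1.0, 2.0], "down_proj": [0.0, 4.0]}
second_state = {"k_proj": [3.0, 2.0], "down_proj": [2.0, 4.0]}
averaged = weighted_average_adapters([first_state, second_state], [0.1, 0.9])
print(averaged)
```

Entries where both adapters agree are left unchanged, and the rest land close to the second adapter because of its 0.9 weight.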
I have tested this and it is working now. The next step could be support for SVD-like strategies.
I am also facing an issue related to IA3: "Cannot merge ia3 layers when the model is loaded in 4-bit", as mentioned in #1704.
@BenjaminBossan, could you please also look into whether this feature (merge_and_unload for IA3 adapters) can be added for both 4-bit and 8-bit quantized models?
I have tested this, it is working
Thanks for giving this a spin. If you have any numbers to share, like scores before and after merging, or even code, that would be great.
I am also facing an issue related to IA3: "Cannot merge ia3 layers when the model is loaded in 4-bit"
Indeed, this is not yet supported. We will certainly take a look at this at some point, but contributions are also very welcome. (And please don't post the same issue twice)