Comments (5)
Yes! Sheared-LLaMA models are standard Hugging Face models that use the Llama model classes, so you can load them the same way you load any other Hugging Face model:
from transformers import LlamaForCausalLM
model = LlamaForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-1.3B")
You can then use the peft module to wrap it as a LoRA model.
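As a minimal sketch of that wrapping step (not something the authors ran): the rank, alpha, dropout, and target modules below are illustrative assumptions, not values from this thread, and `lora_settings` and `build_lora_model` are hypothetical helper names.

```python
def lora_settings():
    """Illustrative LoRA hyperparameters (assumed, not from the thread)."""
    return {
        "r": 8,                                  # rank of the low-rank update
        "lora_alpha": 16,                        # scaling factor for the update
        "lora_dropout": 0.05,                    # dropout applied on the LoRA path
        "target_modules": ["q_proj", "v_proj"],  # Llama attention projections
    }


def build_lora_model(model_name="princeton-nlp/Sheared-LLaMA-1.3B"):
    """Load the base model and wrap it with LoRA adapters via peft."""
    # Imported here so the settings helper can be used without these packages.
    from transformers import LlamaForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    base_model = LlamaForCausalLM.from_pretrained(model_name)
    config = LoraConfig(task_type=TaskType.CAUSAL_LM, **lora_settings())
    return get_peft_model(base_model, config)


# Usage (downloads the checkpoint on first call):
#   model = build_lora_model()
#   model.print_trainable_parameters()
```

Only the adapter weights are trainable in the returned model; the base weights stay frozen.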
from llm-shearing.
Ah, thank you so much! What kind of computer specs do you use to load and train these models? I'm using 2 A100s with 80 GB of memory each but keep running into cgroup out-of-memory errors.
Two A100s should be more than sufficient to train 1.3B and 2.7B LoRA models!
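A rough back-of-the-envelope check, as a sketch only: the parameter counts are the nominal sizes from the model names, and the bytes-per-parameter figures are common rules of thumb (2 bytes for fp16/bf16 weights, about 16 bytes for mixed-precision full fine-tuning with Adam), not measurements. Activations, batch size, and sequence length are ignored. Note also that a cgroup out-of-memory error typically concerns host (CPU) RAM limits set by the job scheduler or container, not GPU memory.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory in GB (10^9 bytes) for a given per-parameter cost."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts taken from the model names (assumption).
MODELS = {"Sheared-LLaMA-1.3B": 1.3e9, "Sheared-LLaMA-2.7B": 2.7e9}

for name, n in MODELS.items():
    frozen = weight_memory_gb(n, 2)      # fp16/bf16 base weights, as in LoRA
    full_ft = weight_memory_gb(n, 16)    # weights + grads + Adam states (rule of thumb)
    print(f"{name}: ~{frozen:.1f} GB frozen weights, ~{full_ft:.1f} GB full fine-tuning state")
```

Under these assumptions, the frozen base weights of either model take up only a small fraction of a single 80 GB A100.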
Generally how much GPU compute do you use for training?
I didn't run any LoRA experiments with the Sheared-LLaMA models. To produce the Sheared-LLaMA models, we used full-parameter tuning, with 8 GPUs for pruning and 16 GPUs for continued pre-training.