
Comments (5)

hsb1995 commented on August 21, 2024

w=16, a=16
I can reproduce the uncompressed results with w=16 and a=16, but once quantization is enabled (w=6, a=6), problems arise:
[screenshot of error output]


ChenMnZ commented on August 21, 2024

@hsb1995
LLaMA-3-8B uses GQA (Grouped-Query Attention), which is not supported by the current 'let' (learnable equivalent transformation).
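For context, here is a minimal sketch of why GQA conflicts with a per-channel equivalent transform between q_proj and k_proj. The head counts are LLaMA-3-8B's published config; the scaling-pair reading of 'let' is an interpretation, not code taken from OmniQuant:

# Sketch: under GQA, q_proj and k_proj have different output widths, so a
# scale on Q's output channels has no one-to-one inverse partner on K's
# output channels -- which is what a Q/K equivalent-transform pair needs.
num_attention_heads = 32   # LLaMA-3-8B query heads
num_key_value_heads = 8    # GQA: every 4 query heads share one KV head
head_dim = 128

q_out = num_attention_heads * head_dim   # 4096 output channels
k_out = num_key_value_heads * head_dim   # 1024 output channels
assert q_out != k_out                    # the per-channel scale pairs no longer line up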


hsb1995 commented on August 21, 2024

> @hsb1995 LLaMA-3-8B uses GQA (Grouped-Query Attention), which is not supported by the current 'let' (learnable equivalent transformation).

Professor, thank you for your work. I don't really understand how GQA is handled, as you mentioned.
Do I understand correctly that I should keep the original generate_act_scale_shift.py script unchanged to obtain the act_scales and act_shifts files,
and then run the weight quantization on top of that?
Parameter settings:
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/LLaMA/llama-8b \
--epochs 20 --output_dir ./log/llama-8b-w6a6 \
--eval_ppl --wbits 6 --abits 6 --lwc
Is the above workable?
I only removed the 'let' step.
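For comparison, the W6A6 recipes in the OmniQuant README pass --let alongside --lwc (the 'let' part is what GQA breaks), with act_scales and act_shifts generated beforehand by generate_act_scale_shift.py. The command below is an assumed variant of that recipe, not a verified one:

CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/LLaMA/llama-8b \
--epochs 20 --output_dir ./log/llama-8b-w6a6 \
--eval_ppl --wbits 6 --abits 6 --lwc --let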


hsb1995 commented on August 21, 2024

Hey, professor. I gave it a try.
It's really hard to get this working. The current errors are as follows; what should I do about them?

[2024-04-24 17:14:17 root](omniquant.py 50): INFO Starting ...
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at /home/sam/Doctorproject/weight/llama-3-8b/LLM-Research/Llama-3-8b/ and are newly initialized: ['model.layers.17.self_attn.rotary_emb.inv_freq', 'model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.3.self_attn.rotary_emb.inv_freq', 'model.layers.4.self_attn.rotary_emb.inv_freq', 'model.layers.16.self_attn.rotary_emb.inv_freq', 'model.layers.31.self_attn.rotary_emb.inv_freq', 'model.layers.21.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.24.self_attn.rotary_emb.inv_freq', 'model.layers.28.self_attn.rotary_emb.inv_freq', 'model.layers.11.self_attn.rotary_emb.inv_freq', 'model.layers.13.self_attn.rotary_emb.inv_freq', 'model.layers.14.self_attn.rotary_emb.inv_freq', 'model.layers.15.self_attn.rotary_emb.inv_freq', 'model.layers.2.self_attn.rotary_emb.inv_freq', 'model.layers.20.self_attn.rotary_emb.inv_freq', 'model.layers.27.self_attn.rotary_emb.inv_freq', 'model.layers.0.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', 'model.layers.6.self_attn.rotary_emb.inv_freq', 'model.layers.9.self_attn.rotary_emb.inv_freq', 'model.layers.29.self_attn.rotary_emb.inv_freq', 'model.layers.26.self_attn.rotary_emb.inv_freq', 'model.layers.22.self_attn.rotary_emb.inv_freq', 'model.layers.19.self_attn.rotary_emb.inv_freq', 'model.layers.12.self_attn.rotary_emb.inv_freq', 'model.layers.8.self_attn.rotary_emb.inv_freq', 'model.layers.30.self_attn.rotary_emb.inv_freq', 'model.layers.25.self_attn.rotary_emb.inv_freq', 'model.layers.5.self_attn.rotary_emb.inv_freq', 'model.layers.18.self_attn.rotary_emb.inv_freq', 'model.layers.23.self_attn.rotary_emb.inv_freq']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/home/sam/Doctorproject/OmniQuant-main/main.py", line 419, in <module>
    main()
  File "/home/sam/Doctorproject/OmniQuant-main/main.py", line 383, in main
    omniquant(
  File "/home/sam/Doctorproject/OmniQuant-main/quantize/omniquant.py", line 102, in omniquant
    raise ValueError("Only support for opt/llama/Llama-2/Llama-3/falcon/mixtral now")
ValueError: Only support for opt/llama/Llama-2/Llama-3/falcon/mixtral now
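One thing worth noting (an assumption, not something confirmed in this thread): the ValueError fires even though Llama-3 appears in the supported list, which would be consistent with main.py deriving the network name from the last path component of --model; the trailing slash in .../Llama-3-8b/ would leave that component empty. A minimal Python sketch of that failure mode, with the parsing logic assumed rather than copied from the source:

model_path = "/home/sam/Doctorproject/weight/llama-3-8b/LLM-Research/Llama-3-8b/"
net = model_path.split('/')[-1]              # "" -- trailing slash leaves an empty last component
print('llama' in net.lower())                # False -> would trip the ValueError
net = model_path.rstrip('/').split('/')[-1]  # "Llama-3-8b"
print('llama' in net.lower())                # True

If that is the cause, dropping the trailing slash from --model is a cheap first thing to try.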


kimoji919 commented on August 21, 2024

@ChenMnZ Hello, I also ran into problems like this.
I tried your code from runing_falcon180b_on_single_a100_80g.ipynb with llama2-7b: I ran quantization and saved with real quant. However, when loading the pre-computed quantized weights it prints a warning like this:
[screenshot of warning]
and then it fails while executing model = model.cuda(), with an error like this:
[screenshot of error]
I also tried your weights from Hugging Face, but they don't seem to work either.

