Comments (2)
Hi, thanks for your interest in our work! It should theoretically support different LLM designs, but we would need to implement the support (just a few lines of code). Please refer to awq/quantize/pre_quant.py and awq/quantize/auto_scale.py from llm-awq.
@tonylins Thanks. When I install the efficient W4A16 CUDA kernel with python setup.py install, it requires -std=c++14 instead of -std=c++17 to be set in setup.py:
extra_compile_args = {
    "cxx": ["-g", "-O3", "-fopenmp", "-lgomp", "-std=c++14"],
    "nvcc": ["-O3", "-std=c++14", "-keep"],
}
However, it still failed with the following errors even when using -std=c++14. Which PyTorch version do you use? Could you please give some advice?
/usr/local/anaconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:58:59: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, at::Tensor>’ to type ‘torch::OrderedDict<std::basic_string<char>, at::Tensor>&’
/usr/local/anaconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:71:61: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’ to type ‘torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >&’
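For context, extra_compile_args like the ones above typically live inside a CUDAExtension definition in setup.py. Below is a minimal sketch of such a build script, assuming PyTorch's torch.utils.cpp_extension API; the extension name and source paths are placeholders, not the actual llm-awq build configuration:

```python
# Hypothetical minimal setup.py for a CUDA extension build (not llm-awq's actual script).
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="awq_inference_engine",
    ext_modules=[
        CUDAExtension(
            name="awq_inference_engine",
            sources=["csrc/pybind.cpp", "csrc/gemm_cuda_gen.cu"],  # placeholder paths
            extra_compile_args={
                # Recent PyTorch headers require -std=c++17; older releases built with -std=c++14.
                "cxx": ["-g", "-O3", "-fopenmp", "-lgomp", "-std=c++17"],
                "nvcc": ["-O3", "-std=c++17"],
            },
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```

BuildExtension takes care of passing the right flags to both the host compiler and nvcc, so the C++ standard only needs to be chosen consistently in the two lists above.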
The root cause may be that a newer gcc version (> 5.4) is required. For gcc 5.4.0, the issue can be solved by modifying the following lines in /usr/local/anaconda3/envs/awq/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:
copy->parameters_.size() == this->parameters_.size()
copy->buffers_.size() == this->buffers_.size()
copy->children_.size() == this->children_.size()
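Before patching the torch headers, it may be simpler to check the compiler version first. A minimal sketch, assuming gcc is on PATH; gcc_major_version is a hypothetical helper, and the > 5.4 threshold comes from the comment above:

```python
import re
import subprocess


def gcc_major_version(version_output: str) -> int:
    """Parse the major version out of `gcc --version` output."""
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if m is None:
        raise ValueError("could not parse gcc version string")
    return int(m.group(1))


if __name__ == "__main__":
    # Query the local compiler; per the comment above, gcc > 5.4 avoids these errors.
    try:
        out = subprocess.run(["gcc", "--version"], capture_output=True, text=True).stdout
        major = gcc_major_version(out)
        print(f"gcc major version {major}:", "OK" if major > 5 else "likely too old")
    except (FileNotFoundError, ValueError):
        print("gcc not found on PATH")
```

If the reported version is too old, upgrading gcc (or pointing CC/CXX at a newer toolchain) is likely less invasive than editing files inside the installed torch package.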
Related Issues (20)
- reproduce Llama2 7b failure : RuntimeError: The expanded size of the tensor (4608) must match the existing size (4096) at non-singleton dimension 3. Target sizes: [65, 32, 512, 4608]. Tensor sizes: [65, 1, 512, 4096] HOT 3
- RuntimeError: Unknown Layout in CUDA Kernel Execution
- Use awq to quantize Deepseek-coder-33B-instruct model
- run_awq.<locals>.Catcher.forward() error
- KeyError: 'llava_llama' HOT 1
- Error while generating real quantized weights for VILA
- Weight int4 quantization, but actually it is int16 HOT 4
- Possible Bug in "_search_module_scale" Function
- AWQ for non-transformer layers?
- Out of memory in Jetson Orin NX 8GB
- Inquiry about Minimum GPU Requirements HOT 1
- when q-group-size = -1, the code will not run
- Weight Packing Format
- illegal memory access when input tokens < 8
- Grok-1 AWQ
- can awq support 3-bit,2-bit, 8-bit quantization? HOT 1
- awq_inference_engine is missing from source, so quantizing custom models fails HOT 2
- Support for Qwen models HOT 2
- AWQ for non-Transformer Implementation HOT 3
- Error while Quantizing OWLv2 model