Comments (4)
Hi, we only care about the relative values of the scaling factors across channels. Multiplying the entire scaling-factor vector by a fixed constant does not affect accuracy; it only helps with numerical stability by keeping the values in a more suitable range. Let me know if you have more questions.
from llm-awq.
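A minimal sketch of this point (illustrative PyTorch only, not the repository's code; the tensor names are made up): ignoring quantization, scaling the weights up per input channel and the activations down by the same factors leaves the layer output exactly unchanged, and under group-wise weight quantization a global constant on the scale vector is likewise absorbed by the quantizer's own step size, so only the relative per-channel values matter.

```python
import torch

# Illustrative sketch only: per-channel scaling is invariant to a global
# constant because the weights are scaled up and the activations down by
# the same factors.
torch.manual_seed(0)
W = torch.randn(8, 16)        # weight: [out_features, in_features]
x = torch.randn(16)           # one activation vector
s = torch.rand(16) + 0.5      # per-input-channel scaling factors

y_ref = W @ x
y_scaled = (W * s) @ (x / s)                            # fold s into W, divide x by s
print(torch.allclose(y_ref, y_scaled, atol=1e-5))       # True

c = 7.3                                                 # any fixed positive constant
y_rescaled = (W * (c * s)) @ (x / (c * s))
print(torch.allclose(y_ref, y_rescaled, atol=1e-5))     # True
```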
Hi @MarsJacobs, the most important factor is that it gives higher accuracy. Our intuition also differs from SmoothQuant's: SmoothQuant wants to preserve the activation outliers for W8A8 quantization, while we only want to introduce activation-awareness into the weight quantization. Therefore, we use the average to reflect the overall effect across different tokens.
from llm-awq.
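To make the contrast concrete, here is a rough, hedged sketch of the two statistics (not llm-awq's actual search loop; `alpha`, the normalization, and the function names are assumptions for illustration): AWQ-style scales are derived from the per-channel mean absolute activation over the calibration tokens, while SmoothQuant-style scales balance the per-channel max of the activations against the per-channel max of the weights.

```python
import torch

# Illustrative sketch only (not llm-awq's code): mean-based vs max-based
# per-channel statistics for deriving scaling factors.

def awq_style_scales(x, alpha=0.5):
    # x: [num_tokens, in_features] calibration activations.
    # The average absolute magnitude per input channel reflects the typical
    # effect of all tokens, not just the outliers.
    act_mean = x.abs().mean(dim=0)
    s = act_mean.pow(alpha).clamp(min=1e-4)
    # Normalize so the vector sits in a comfortable numeric range;
    # only the relative values across channels matter.
    return s / (s.max() * s.min()).sqrt()

def smoothquant_style_scales(x, W, alpha=0.5):
    # SmoothQuant balances per-channel activation maxima against weight
    # maxima to migrate activation outliers into the weights for W8A8.
    act_max = x.abs().amax(dim=0)
    w_max = W.abs().amax(dim=0)          # W: [out_features, in_features]
    return (act_max.pow(alpha) / w_max.pow(1 - alpha)).clamp(min=1e-4)

x = torch.randn(1024, 16)                # toy calibration activations
W = torch.randn(8, 16)                   # toy weight matrix
print(awq_style_scales(x))
print(smoothquant_style_scales(x, W))
```

In practice the real search sweeps a ratio (the exponent above) over a small grid and keeps the value that minimizes the quantized layer's output error; the sketch only shows how the candidate scales are formed.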
Thank you for the further clarification! I have one question: in the AWQ implementation, is there a particular reason for using the abs().mean() of the weights and activations when searching for the scale values? (For comparison, I understand that SmoothQuant uses the max of the weights and activations.)
Any additional explanation would be greatly helpful for gaining a deeper understanding of AWQ. Thanks in advance!
from llm-awq.
Thank you for your response. Considering the difference in motivation between SmoothQuant's smoothing of activation outliers and AWQ's activation-aware weight scaling, the reasoning behind the abs().mean() implementation is much clearer now. Thank you for sharing your great work and for the further answers!
from llm-awq.
Related Issues (20)
- reproduce Llama2 7b failure : RuntimeError: The expanded size of the tensor (4608) must match the existing size (4096) at non-singleton dimension 3. Target sizes: [65, 32, 512, 4608]. Tensor sizes: [65, 1, 512, 4096] HOT 3
- RuntimeError: Unknown Layout in CUDA Kernel Execution
- Use awq to quantize Deepseek-coder-33B-instruct model
- run_awq.<locals>.Catcher.forward() error
- KeyError: 'llava_llama' HOT 1
- Error while generating real quantized weights for VILA
- Weight int4 quantization, but actually it is int16 HOT 4
- Possible Bug in "_search_module_scale" Function
- AWQ for non-transformer layers?
- Out of memory in Jetson Orin NX 8GB
- Inquiry about Minimum GPU Requirements HOT 1
- when q-group-size = -1, the code will not run
- Weight Packing Format
- illegal memory access when input tokens < 8
- Grok-1 AWQ
- can awq support 3-bit,2-bit, 8-bit quantization? HOT 1
- awq_inference_engine is missing from source, so quantizing custom models fails HOT 2
- Support for Qwen models HOT 2
- AWQ for non-Transformer Implementation HOT 3
- Error while Quantizing OWLv2 model