Are the models you report in your readme supposed to be actual 2 bit models or just 2.

Actual bitrate of models on github? about aqlm HOT 5 CLOSED

tsengalb99 commented on August 21, 2024

Actual bitrate of models on github?

from aqlm.

Comments (5)

Vahe1994 commented on August 21, 2024

Hello!
The average bitrate for the 1x16 setup on the Llama-2-7b model is around 2.29 bits per weight because of codebook size overhead. The same configuration on the 70B model gives around 2.07 bits on average. The 2x8 setup on the 7B model gives approximately 2.006 bits. In all our experiments, the default group size is 8. The significant difference between the 1x16 and 2x8 model sizes comes from codebook size overhead. For more details, please see Appendix G, "Estimating model size", in the paper at https://arxiv.org/pdf/2401.06118. Additionally, we do not quantize the LM head.
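As a rough check of these figures, the sketch below estimates average bits per weight for one transformer block. It is not code from the AQLM repository. It assumes that each quantized matrix stores its codes, its own 16-bit codebooks, and one 16-bit scale per output row, and it takes the layer shapes from the public Llama 2 configurations (hidden size 4096 and MLP size 11008 for 7B; hidden size 8192, MLP size 28672, and 8 key/value heads of dimension 128 for 70B). The names `average_bits`, `LLAMA2_7B_LAYER`, and `LLAMA2_70B_LAYER` are hypothetical.

```python
def average_bits(shapes, num_codebooks, codebook_bits, group_size=8, fp_bits=16):
    """Estimate average bits per weight over a list of (d_out, d_in) matrices.

    Assumed storage per matrix (an illustration, not the library's exact layout):
      codes:     one index of `codebook_bits` bits per codebook per group of weights
      codebooks: num_codebooks * 2**codebook_bits vectors of `group_size` fp16 values
      scales:    one fp16 value per output row
    """
    total_bits = 0
    total_params = 0
    for d_out, d_in in shapes:
        params = d_out * d_in
        codes = (params // group_size) * num_codebooks * codebook_bits
        codebooks = num_codebooks * (2 ** codebook_bits) * group_size * fp_bits
        scales = d_out * fp_bits
        total_bits += codes + codebooks + scales
        total_params += params
    return total_bits / total_params


# Linear layers of one transformer block as (d_out, d_in). Every block has the
# same shapes, so one block is enough to compute the average. Embeddings and the
# LM head are left out, since the LM head is not quantized.
LLAMA2_7B_LAYER = (
    [(4096, 4096)] * 4        # q, k, v, o projections
    + [(11008, 4096)] * 2     # gate and up projections
    + [(4096, 11008)]         # down projection
)
LLAMA2_70B_LAYER = (
    [(8192, 8192)] * 2        # q and o projections
    + [(1024, 8192)] * 2      # k and v projections (8 KV heads of dimension 128)
    + [(28672, 8192)] * 2     # gate and up projections
    + [(8192, 28672)]         # down projection
)

if __name__ == "__main__":
    print(f"7B  1x16: {average_bits(LLAMA2_7B_LAYER, 1, 16):.3f}")
    print(f"7B  2x8:  {average_bits(LLAMA2_7B_LAYER, 2, 8):.3f}")
    print(f"70B 1x16: {average_bits(LLAMA2_70B_LAYER, 1, 16):.3f}")
```

Under these assumptions the estimates land close to the 2.29, 2.006, and 2.07 figures quoted above. The gap between 1x16 and 2x8 comes almost entirely from the codebook term: a single 16-bit codebook holds 65,536 vectors per matrix, while two 8-bit codebooks hold 512 in total.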

tsengalb99 commented on August 21, 2024

Hi Vage, are you scaling and shifting each of these groups of 8, or just treating them directly as vectors for vector quantization? Thanks

BlackSamorez commented on August 21, 2024

@tsengalb99
We are not scaling and shifting each group. We use row-wise scales and offsets; 8 is the vector quantization group size.

github-actions commented on August 21, 2024

This issue is stale because it has been open for 30 days with no activity.

github-actions commented on August 21, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
