Hi! I'm having the following issue on the forward pass (only when using an AQLM model)

aqlm/inference_kernels/cuda_kernel.cu compilation errors (CLOSED)

amrothemich commented on June 16, 2024

aqlm/inference_kernels/cuda_kernel.cu compilation errors

from aqlm.

Comments (10)

BlackSamorez commented on June 16, 2024

Please provide the following information:

aqlm version
torch version
CUDA version
Your GPU model
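The four items above can be gathered in one go. This is a minimal sketch, assuming `torch` is installed and a CUDA device is visible; the helper name `collect_env_info` is made up for illustration and is not part of aqlm. Anything that can't be detected is reported as `None`.

```python
import importlib.metadata


def collect_env_info() -> dict:
    """Gather the four requested items; anything unavailable stays None."""
    info = {"aqlm": None, "torch": None, "cuda": None, "gpu": None}
    try:
        info["aqlm"] = importlib.metadata.version("aqlm")
    except importlib.metadata.PackageNotFoundError:
        pass  # aqlm is not installed in this environment
    try:
        import torch
    except ImportError:
        return info  # without torch the remaining fields stay None
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # CUDA version torch was built against
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue covers everything asked for here.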

amrothemich commented on June 16, 2024

Thanks!

aqlm==1.1.3
torch: 2.2.2+cu121
cuda: 12.1
GPU: Tesla V100 via Databricks: NCv3 VM https://learn.microsoft.com/en-us/azure/virtual-machines/ncv3-series

(And FYI using ISTA-DASLab/Mixtral-8x7B-Instruct-v0_1-AQLM-2Bit-1x16-hf, I'm still having the same issue)

BlackSamorez commented on June 16, 2024

@amrothemich The V100 doesn't really support efficient bf16 operations. If you update aqlm to the latest version, it will check the Compute Capability and display a more readable error.

amrothemich commented on June 16, 2024

Okay got it, thanks. So is bf16 a requirement for the whole library or just this model?

BlackSamorez commented on June 16, 2024

bfloat16 is not a requirement at all. You can pass torch_dtype=torch.float16 to from_pretrained to use standard half precision. It should work for any model no problem.
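As a rough sketch of what that looks like in code: the helper `resolve_dtype_name` below is hypothetical (not part of aqlm or transformers) and falls back to float16 when bfloat16 is requested on an older GPU. The 8.0 Compute Capability threshold is an assumption based on bf16 hardware support starting with Ampere. `load_model` needs `torch`, `transformers`, and a CUDA GPU to actually run.

```python
def resolve_dtype_name(requested: str, capability: tuple) -> str:
    """Fall back to float16 when bfloat16 is requested on a pre-8.0 GPU.

    The (8, 0) threshold is an assumption, not something aqlm documents.
    """
    if requested == "bfloat16" and tuple(capability) < (8, 0):
        return "float16"
    return requested


def load_model(model_id: str, requested: str = "float16"):
    """Load a checkpoint with a dtype the current GPU can handle."""
    import torch
    from transformers import AutoModelForCausalLM

    capability = torch.cuda.get_device_capability(0)  # (7, 0) on a V100
    dtype = getattr(torch, resolve_dtype_name(requested, capability))
    return AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
```

For example, `load_model("ISTA-DASLab/Mixtral-8x7B-Instruct-v0_1-AQLM-2Bit-1x16-hf")` would request standard half precision on the V100 from this thread.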

BlackSamorez commented on June 16, 2024

Actually, float16 is the default for the models we put out. No need to specify it explicitly.
But first please update aqlm. It simply wouldn't compile otherwise.

amrothemich commented on June 16, 2024

Yeah, I didn't specify in the original. Isn't 1.1.3 the latest version? I think it's the newest on pip, should I install from github?

BlackSamorez commented on June 16, 2024

I see, I'm sorry. It's not that you didn't update, it's actually the opposite: I broke it in the latest release.
The latest dequantization kernels won't compile on GPUs with Compute Capability of 8 or less.
For now, you can downgrade to 1.1.2 and everything should work. I'll try and fix the error to allow you to use the latest dequantization kernels as well.

amrothemich commented on June 16, 2024

Ah okay got it, thanks so much for your help!

BlackSamorez commented on June 16, 2024

Please update to aqlm>=1.1.4 and it should resolve the issue.
Feel free to reopen it if it doesn't.
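To check whether an environment is still on the affected release, something like the following could work. It is a sketch only: `parse_version` and `needs_upgrade` are illustrative helpers, and the single version treated as broken (1.1.3) is taken from this thread rather than from any official changelog.

```python
import importlib.metadata
import re

BROKEN_VERSION = (1, 1, 3)  # release reported broken in this thread


def parse_version(text: str) -> tuple:
    """Turn "1.1.3" into (1, 1, 3), ignoring suffixes such as "+cu121"."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def needs_upgrade(version_text: str) -> bool:
    """True only for the release whose kernels failed to compile here."""
    return parse_version(version_text) == BROKEN_VERSION


if __name__ == "__main__":
    try:
        installed = importlib.metadata.version("aqlm")
    except importlib.metadata.PackageNotFoundError:
        print("aqlm is not installed")
    else:
        if needs_upgrade(installed):
            print(f"aqlm {installed}: update to >=1.1.4 or downgrade to 1.1.2")
        else:
            print(f"aqlm {installed}: not the release reported in this issue")
```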
