[Bug]: bnb quant load error about aphrodite-engine HOT 3 OPEN

theobjectivedad commented on July 26, 2024

[Bug]: bnb quant load error

from aphrodite-engine.

Comments (3)

AlpinDale commented on July 26, 2024

Might be a good idea to keep it open in case someone else has the same issue. I'll close it myself once we have real bitsandbytes support.

AlpinDale commented on July 26, 2024

The quant name in aphrodite is unfortunately a bit misleading; I intend to fix this in the next release. The load_in_4bit quant isn't actually bitsandbytes, it's SmoothQuant+ (SQ+). We don't allow loading bnb weights directly yet. This will also be addressed in the next release.

Note that SQ+ is faster and offers better quality than bnb 4-bit. bnb reduces throughput compared to fp16, while SQ+ increases it by close to 3x.
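
For anyone hitting the same error: since pre-quantized bnb weights can't be loaded directly, it may help to check what a checkpoint contains before trying to load it. The following is a minimal sketch, assuming the checkpoint follows the Hugging Face convention of recording `quantization_config.quant_method` in its `config.json`. The helper names `is_bnb_checkpoint` and `check_model_dir` are hypothetical and are not part of aphrodite-engine.

```python
import json
from pathlib import Path


def is_bnb_checkpoint(config: dict) -> bool:
    """Return True if a parsed config.json describes bitsandbytes-quantized weights.

    Assumption: the checkpoint was saved by Hugging Face Transformers, which
    records the quantization backend under quantization_config.quant_method.
    """
    quant_cfg = config.get("quantization_config") or {}
    return quant_cfg.get("quant_method") == "bitsandbytes"


def check_model_dir(model_dir: str) -> bool:
    """Read <model_dir>/config.json and report whether it is a bnb checkpoint."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return is_bnb_checkpoint(json.load(f))
```

If the check returns True, the weights are already bnb-quantized, which is the case described above as not yet supported. Pointing the engine at the original fp16 weights and letting load_in_4bit quantize them is the likely alternative.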

theobjectivedad commented on July 26, 2024

Got it, ty for looking at this and helping me understand. Do you want me to close this issue?
