Comments (5)
It's mostly due to the QuIP# kernels. I'll look into extending support to P100s (we used to support them before) tomorrow.
from aphrodite-engine.
Ah, I see. So for now, it only fails when the QuIP# kernels are involved? I was thinking that if it's as easy as changing setup.py and the other quantization methods would then work, it's a non-issue. I just wanted to make sure it will work at all, and that there isn't some big change in Aphrodite as a whole that stops it from working with P100s.
I'm going to put together either a 4xP100 or 4xP40 system to test out the larger models and higher-context models that just came out, so I'm just trying to make sure the stuff I want to run on them works first lol. The Tesla P100s are a great deal because they're 16GB cards that have over 2x the bandwidth of the P40s. Although if speed is no concern, I guess the P40s are a better deal with 24GB.
Currently Aphrodite is working great on my 2x3090, so thanks for your work on this project!
I did try it myself on the dev branch, but I'm waaaay out of my depth. I got it to build using the runtime and exporting TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0 7.5 8.0 8.6 8.9 9.0+PTX", but actually trying to load a model results in "RuntimeError: CUDA error: no kernel image is available for execution on the device". As far as I understand, PyTorch does still ship with kernels for the P100, though, so I'm unsure what's going wrong here.
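For anyone debugging the same error, here is a rough sketch of how to check whether a given arch list actually covers a card's compute capability. It is not part of Aphrodite; `parse_arch_list` and `covers` are made-up helper names, and it only understands plain numeric entries like `6.0` or `9.0+PTX` (not names like `Pascal` or suffixed entries like `9.0a`). The rule it assumes is the usual CUDA one: a compiled kernel runs on a device with the same major version and an equal or higher minor version, while a `+PTX` entry can be JIT-compiled for any equal or newer capability.

```python
def parse_arch_list(arch_list):
    """Parse a TORCH_CUDA_ARCH_LIST-style string into (major, minor, has_ptx) tuples.

    Hypothetical helper: handles only numeric entries such as "6.0" or "9.0+PTX".
    """
    entries = []
    for token in arch_list.replace(";", " ").split():
        has_ptx = token.endswith("+PTX")
        version = token[: -len("+PTX")] if has_ptx else token
        major, minor = version.split(".")
        entries.append((int(major), int(minor), has_ptx))
    return entries


def covers(arch_list, capability):
    """Return True if some entry in arch_list can run on a device of this capability."""
    dev_major, dev_minor = capability
    for major, minor, has_ptx in parse_arch_list(arch_list):
        # Compiled (SASS) kernels: same major version, entry minor <= device minor.
        if major == dev_major and minor <= dev_minor:
            return True
        # PTX can be JIT-compiled for any device at or above the entry's capability.
        if has_ptx and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


ARCH_LIST = "6.0 6.1 7.0 7.5 8.0 8.6 8.9 9.0+PTX"
P100 = (6, 0)  # compute capability of the Tesla P100
print("P100 covered by arch list:", covers(ARCH_LIST, P100))
```

The list from the comment above does cover 6.0 by this check, so the missing kernel image presumably comes from somewhere other than the exported variable, for example an extension or dependency that was built with a different arch list. On a machine with PyTorch installed, `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` report the device capability and the architectures the installed PyTorch build itself was compiled for, which is worth comparing as well.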