Comments (2)
Not sure if this is a good idea, since the build process takes too long. The Docker image is supposed to pull aphrodite from PyPI.
Build times have been reduced significantly with #130, but that's on the dev branch, which is slightly unstable at the moment.
from aphrodite-engine.
If you want to shorten the build process, start caching. If caching the whole build is not possible, cache as many layers as possible and build your image from there.
If you fork the project, make changes, and then try to build a Docker container, it always pulls from PyPI. If you try to build a container for ARM64 devices, this will fail.
My proposal to fulfill your request is simple: make sure that all dependencies are compiled upfront, so that the build starts with only the app itself left to compile. If that is not possible, create one Dockerfile that builds all dependencies and another that just pulls the app.
This will reduce the build to mere seconds of compilation (assuming all images are downloaded upfront).
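As a rough illustration of the split described above, here is a minimal multi-stage sketch. It is not the project's actual Dockerfile: the base image, the `requirements.txt` file name, and the `/install` prefix are all assumptions made for the example. The idea is only that the dependency stage is rebuilt when the dependency list changes, while edits to the app source reuse the cached layers.

```dockerfile
# Hypothetical sketch; base image, file names, and paths are assumptions.

# Stage 1: install (and compile) dependencies only.
# Docker reuses this layer from cache as long as requirements.txt is unchanged.
FROM python:3.10-slim AS deps
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: start from the prebuilt dependencies and add the app source.
FROM python:3.10-slim
WORKDIR /app
COPY --from=deps /install /usr/local
# Copying the local checkout (rather than pulling from PyPI) means
# changes made in a fork end up in the image.
COPY . .
RUN pip install --no-cache-dir --no-deps .
```

With this layout, changing only the app source invalidates just the last two steps. Whether that actually gets the build down to seconds depends on how much of the compilation lives in the app itself rather than in its dependencies.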