Comments (2)
Usage: openllm build [OPTIONS] {flan-t5|dolly-v2|chatglm|starcoder|falcon|stablelm|opt|mpt}

  Package a given model into a Bento.

  $ openllm build flan-t5 --model-id google/flan-t5-large

  > NOTE: To run a container built from this Bento with GPU support, make sure
  > to have https://github.com/NVIDIA/nvidia-container-toolkit installed locally.

Options:
  --model-id TEXT                 Optional model_id name or path for (fine-tuned) weights.
  -o, --output [json|pretty|porcelain]
                                  Output format. [env var: OPENLLM_OUTPUT; default: pretty]
  --overwrite                     Overwrite the existing Bento for the given LLM if it already exists.
  --workers-per-resource FLOAT    Number of workers per assigned resource. See
                                  https://docs.bentoml.org/en/latest/guides/scheduling.html#resource-scheduling-strategy
                                  for more information. By default, this is set to 1.

                                  NOTE: The workers value passed into 'build' determines how the LLM can be
                                  provisioned in Kubernetes as well as in a standalone container. This ensures
                                  it has the same effect as 'openllm start --workers ...'.

Optimisation options: [mutually_exclusive]
  --quantize [int8|int4|gptq]     Set the quantization mode for serving in deployment.
                                  GPTQ is currently a work in progress and will be available soon.

                                  NOTE: Quantization is only available for PyTorch models.
  --bettertransformer             Apply the BetterTransformer wrapper to serve the model. This is applied at
                                  serving time.
  --enable-features FEATURE[,FEATURE]
                                  Enable additional features for building this LLM Bento. Available: mpt,
                                  fine-tune, chatglm, agents, flan-t5, playground, starcoder, openai, falcon
  --adapter-id [PATH | [remote/][adapter_name:]adapter_id][, ...]
                                  Optional adapter IDs to be included within the Bento. Note that if you are
                                  using a relative path, '--build-ctx' must be passed.
  --build-ctx TEXT                Build context. This is required if --adapter-id uses a relative path.
  --model-version TEXT            Model version for this 'model-id' if it is a custom path.
  --dockerfile-template FILENAME  Optional custom Dockerfile template to use with this BentoLLM.

Miscellaneous options:
  -q, --quiet                     Suppress all output.
  --debug, --verbose              Print debug logs.
  --do-not-track                  Do not send usage info.
  -h, --help                      Show this message and exit.
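As a concrete illustration of combining the options above (the model ID and flag values here are placeholders taken from the help text's own example, not a prescribed configuration), a build invocation might look like:

```shell
# Build a Bento for flan-t5 from a fine-tuned checkpoint, with int8
# quantization and half a GPU per worker; emit machine-readable JSON output.
openllm build flan-t5 \
  --model-id google/flan-t5-large \
  --quantize int8 \
  --workers-per-resource 0.5 \
  --output json
```

Note that if you also pass `--adapter-id` with a relative path, `--build-ctx` must be supplied as well, per the help text above.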
from openllm.
Sorry for the late reply, but are there any updates on this? Feel free to reopen if you are still running into this issue.
I can build mpt with openllm build
(tested on Linux and macOS)
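Following the note in the help text about GPU support, a typical follow-up after a successful build is to containerize the Bento and run it with GPU access. The Bento tag below is a placeholder; `openllm build` prints the actual tag on success, and running with `--gpus all` assumes the nvidia-container-toolkit is installed:

```shell
# Containerize the Bento produced by `openllm build`
# (replace the tag with the one printed by the build).
bentoml containerize mpt-service:latest

# Run the resulting image with GPU access on port 3000;
# requires nvidia-container-toolkit on the host.
docker run --rm --gpus all -p 3000:3000 mpt-service:latest
```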
Related Issues (20)
- feat: support enforce_eager option from cli HOT 1
- bug: Cannot Run an OpenLLM server regardless of where I try to get it from or what model I use HOT 6
- bug: Attempting to invoke OpenLLM from Langchain results in error HOT 2
- Availability of the OpenAI /v1/completions API Endpoint ? HOT 3
- feat: Include starcoder2
- How to deploy a model using a single machine multi card approach? HOT 1
- Documentation HOT 1
- feat: add gemma2 HOT 1
- Can openllm support local path model? HOT 10
- feat: support Qwen1.5 HOT 2
- feat: any plan to support NPU HOT 1
- bug: An exception occurred while instantiating runner 'llm-mistral-runner' HOT 2
- bug: Not enough data for satisfy transfer length header HOT 3
- feat: Can you support llama3? HOT 3
- bug: WARNING: openllm 0.4.44 does not provide the extra 'gemma' HOT 1
- feat: support LMDeploy backend HOT 7
- bug: error coming up while install the vllm using pip install "openllm[vllm]" HOT 1
- For AMD/GPU, how to use multi GPUS in the api_server.py HOT 2
- bug: pip package version issues
- feat: Multimodal LLMs? HOT 1