aarnphm commented on August 15, 2024
Usage: openllm build [OPTIONS] {flan-t5|dolly-v2|chatglm|starcoder|falcon|stablelm|opt|mpt}

  Package a given models into a Bento.

  $ openllm build flan-t5 --model-id google/flan-t5-large

  > NOTE: To run a container built from this Bento with GPU support, make sure
  > to have https://github.com/NVIDIA/nvidia-container-toolkit install locally.

Options:
  --model-id TEXT                 Optional model_id name or path for (fine-tune) weight.
  -o, --output [json|pretty|porcelain]
                                  Showing output type.  [env var: OPENLLM_OUTPUT; default: pretty]
  --overwrite                     Overwrite existing Bento for given LLM if it already exists.
  --workers-per-resource FLOAT    Number of workers per resource assigned. See
                                  https://docs.bentoml.org/en/latest/guides/scheduling.html#resource-scheduling-strategy
                                  for more information. By default, this is set to 1.
                                  
                                  NOTE: The workers value passed into 'build' will determine how the LLM can be
                                  provisioned in Kubernetes as well as in standalone container. This will ensure it
                                  has the same effect with 'openllm start --workers ...'
  Optimisation options.: [mutually_exclusive]
    --quantize [int8|int4|gptq]   Set quantization mode for serving in deployment.
                                  
                                  GPTQ is currently working in progress and will be available soon.
                                  
                                  NOTE: Quantization is only available for PyTorch models.
    --bettertransformer           Apply FasterTransformer wrapper to serve model. This will applies during serving
                                  time.
  --enable-features FEATURE[,FEATURE]
                                  Enable additional features for building this LLM Bento. Available: mpt, fine-tune,
                                  chatglm, agents, flan-t5, playground, starcoder, openai, falcon
  --adapter-id [PATH | [remote/][adapter_name:]adapter_id][, ...]
                                  Optional adapters id to be included within the Bento. Note that if you are using
                                  relative path, '--build-ctx' must be passed.
  --build-ctx TEXT                Build context. This is required if --adapter-id uses relative path
  --model-version TEXT            Model version provided for this 'model-id' if it is a custom path.
  --dockerfile-template FILENAME  Optional custom dockerfile template to be used with this BentoLLM.
  Miscellaneous options: 
    -q, --quiet                   Suppress all output.
    --debug, --verbose            Print out debug logs.
    --do-not-track                Do not send usage info
  -h, --help                      Show this message and exit.

aarnphm commented on August 15, 2024

Sorry for the late reply, but are there any updates on this? Feel free to reopen if you are still running into this issue.

I can build mpt with openllm build (tested on Linux and macOS).
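
For anyone reproducing this, here is a minimal sketch of how that invocation could be assembled from a script. It uses only the model names and flags shown in the help text above; the helper name build_command is hypothetical and not part of OpenLLM, and the sketch only prints the command rather than running it.

```python
import shlex

# Model names copied from the usage line of `openllm build --help` above.
SUPPORTED_MODELS = (
    "flan-t5", "dolly-v2", "chatglm", "starcoder",
    "falcon", "stablelm", "opt", "mpt",
)
# Output types copied from the -o/--output option above.
OUTPUT_TYPES = ("json", "pretty", "porcelain")


def build_command(model, model_id=None, overwrite=False, output=None):
    """Return the argv list for an `openllm build` call (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if output is not None and output not in OUTPUT_TYPES:
        raise ValueError(f"unsupported output type: {output!r}")

    args = ["openllm", "build", model]
    if model_id is not None:
        args += ["--model-id", model_id]
    if overwrite:
        args.append("--overwrite")
    if output is not None:
        args += ["--output", output]
    return args


# Print the command instead of executing it, so the sketch has no side effects.
print(shlex.join(build_command("mpt", overwrite=True, output="json")))
```

To actually run the build, the returned list can be passed to subprocess.run(cmd, check=True) on a machine where openllm is installed.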
