Comments (7)
Hi @MatthieuToulemont, the Triton TRT-LLM container is a special container that contains only the TRT-LLM backend and the Python backend. If you'd like to have other backends, you could try either of the following:
- copy the backends over from the regular Triton container, where they live under /opt/tritonserver/backends/*
- build the container following the steps here.
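For the first option, a minimal sketch of the copy step might look like the following. It assumes the regular container's /opt/tritonserver/backends directory has already been made available inside the TRT-LLM container, for example through a mounted volume; the function name copy_missing_backends and the mount path in the usage comment are hypothetical, not part of Triton.

```python
import shutil
from pathlib import Path


def copy_missing_backends(src_root, dst_root):
    """Copy backend directories that exist in src_root but not in dst_root.

    Returns the sorted names of the backends that were copied.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for backend in sorted(p for p in src_root.iterdir() if p.is_dir()):
        target = dst_root / backend.name
        if target.exists():
            # Keep the backend that already ships with this container.
            continue
        # Preserve symlinks, since backend folders often contain linked libraries.
        shutil.copytree(backend, target, symlinks=True)
        copied.append(backend.name)
    return copied


# Example call (hypothetical mount point for the regular container's backends):
# copy_missing_backends("/mnt/regular_backends", "/opt/tritonserver/backends")
```

Backends already present in the destination (such as the Python backend) are skipped rather than overwritten, so the container's own copies stay untouched.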
from server.
OK, is the Python backend in the TRT-LLM container different from the one in 24.05-py3?
No, the Python Backend should be the same.
Does the 24.05-py3 container include the ONNX, TensorRT, and TorchScript backends?
@tricky61 The nvcr.io/nvidia/tritonserver:24.05-py3 container contains the ONNX, TRT, and PyTorch backends. The nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3 container only has the TRTLLM and Python backends.
OK. I am using nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3.
I also use tritonserver:23.11 with the vLLM backend added manually.
I will try adding the vLLM backend to nvcr.io/nvidia/tritonserver:24.05-py3 manually.
Does this approach make any difference? I ask because nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3 is 11.2 GB, while nvcr.io/nvidia/tritonserver:24.05-py3 is 7.55 GB.
@tricky61 It shouldn't make any difference. Note that you'd have to pip install vllm and make sure model.py exists under /opt/tritonserver/backends/vllm_backend.
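The two conditions above can be checked with a short script before starting the server. This is only a sketch: the helper name check_vllm_backend is hypothetical, and the default directory is simply the path quoted in the comment above.

```python
import importlib.util
from pathlib import Path


def check_vllm_backend(backend_dir="/opt/tritonserver/backends/vllm_backend",
                       package="vllm"):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # Check 1: the Python package is importable (i.e. `pip install vllm` was done).
    if importlib.util.find_spec(package) is None:
        problems.append(f"Python package '{package}' is not installed")
    # Check 2: the backend's model.py is in the expected directory.
    if not (Path(backend_dir) / "model.py").is_file():
        problems.append(f"model.py not found under {backend_dir}")
    return problems


# Example call inside the container:
# for problem in check_vllm_backend():
#     print(problem)
```

An empty result only confirms that the package and the file are present; it does not verify that the installed vllm version matches what the backend expects.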