Comments (3)
Hi @langong347, the underlying model itself needs to support batching in order for Triton to form batches for it.
Since the model has a fixed batch size of 1, Triton expects exactly that shape, so dynamic batching won't work with it. You can either (a) keep the model as is, set max_batch_size to 0 in the config, and add the "1" batch dimension to the input/output shapes, or (b) keep the config as is and regenerate the ONNX model with a dynamic batch dimension.
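To make the difference between (a) and (b) concrete, here is a rough Python sketch that renders a minimal config.pbtxt for each case. The model name, tensor names, data type, per-sample shape, and batch limit are all hypothetical placeholders, not values from the original issue. It assumes a single-input, single-output ONNX model.

```python
def render_config(name, sample_shape, max_batch_size):
    """Render a minimal Triton config.pbtxt as text.

    sample_shape is the per-sample shape WITHOUT the batch dimension.
    Tensor names and TYPE_FP32 are illustrative assumptions only.
    """
    if max_batch_size == 0:
        # Option (a): batching disabled, so dims spell out the full shape,
        # including the model's fixed batch dimension of 1.
        dims = [1, *sample_shape]
        batching = ""
    else:
        # Option (b): Triton prepends the batch dimension itself, so dims
        # omit it. This presumes the ONNX model was regenerated with a
        # dynamic first dimension.
        dims = list(sample_shape)
        batching = "dynamic_batching { }\n"

    dims_text = ", ".join(str(d) for d in dims)
    return (
        f'name: "{name}"\n'
        'backend: "onnxruntime"\n'
        f"max_batch_size: {max_batch_size}\n"
        f'input [ {{ name: "INPUT__0" data_type: TYPE_FP32 dims: [ {dims_text} ] }} ]\n'
        f'output [ {{ name: "OUTPUT__0" data_type: TYPE_FP32 dims: [ {dims_text} ] }} ]\n'
        f"{batching}"
    )


if __name__ == "__main__":
    # Hypothetical model with a 3x224x224 per-sample input and output.
    print(render_config("my_model", [3, 224, 224], max_batch_size=0))  # option (a)
    print(render_config("my_model", [3, 224, 224], max_batch_size=8))  # option (b)
```

With max_batch_size set to 0 the dims carry the leading 1. With a positive max_batch_size the dims list only the per-sample shape, and the dynamic_batching block lets Triton form batches.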
from server.
Thank you @rmccorm4! (b) is what I've adopted since then, and it worked.