Comments (11)
I would use a built-in array module and provide shape and type as separate arguments for maximum flexibility.
However, along with that, we should have the capability to return those types as results, and right now we only return numpy arrays.
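A minimal sketch of what "shape and type as separate arguments" could look like, using only the stdlib `array` module; the helper name `feed_from_buffer` and the typecode convention are illustrative assumptions, not an existing API:

```python
import array

# Hypothetical sketch: accept a flat array.array buffer plus an explicit
# shape and element typecode, instead of requiring a numpy ndarray.
def feed_from_buffer(data, shape, typecode="f"):
    buf = array.array(typecode, data)  # contiguous C buffer of `typecode`
    n = 1
    for d in shape:
        n *= d
    if len(buf) != n:
        raise ValueError(f"{len(buf)} elements do not match shape {shape}")
    return buf, tuple(shape), typecode

buf, shape, tc = feed_from_buffer([1.0, 2.0, 3.0, 4.0], (2, 2))
```

Keeping shape and element type separate from the buffer is what gives the flexibility the comment asks for: the same flat buffer can describe tensors of any rank.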
I think OrtValue-based interfaces can help us. The simple run() returns numpy.
However, we can add methods to OrtValue to accept other types, such as Python arrays, and convert results back to Python arrays as well.
from onnxruntime.
ORT supports bfloat16 internally. I would think a separate per type approach is not the best way to go. Rather, I would implement a new universal method that does not depend on numpy so much, and also include other types, such as quantized 8 and 4.
I suggest using run_with_iobinding and making a helper so that run(sess, None, feeds) works where feeds is a dictionary dict[str, torch.Tensor].
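A minimal sketch of such a helper under the assumption that feeds arrive as dict[str, torch.Tensor]; the names `run_torch` and `EchoSession` are hypothetical, and a real session would be an onnxruntime.InferenceSession:

```python
import torch

# Hypothetical helper: convert a dict[str, torch.Tensor] feed into the
# dict[str, numpy.ndarray] that the standard sess.run(None, feeds) call
# already accepts, then delegate to the session.
def run_torch(sess, feeds):
    np_feeds = {name: t.detach().cpu().numpy() for name, t in feeds.items()}
    return sess.run(None, np_feeds)

# Stand-in for an onnxruntime.InferenceSession, used only to show the
# call shape; a real session would execute the model graph instead.
class EchoSession:
    def run(self, output_names, np_feeds):
        return [np_feeds["x"] * 2]

outputs = run_torch(EchoSession(), {"x": torch.ones(2, 3)})
```

Note that `.cpu()` forces a device-to-host copy for GPU tensors, which is exactly the cost the run_with_iobinding path avoids.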
IOBinding has its own specifics as to when the input is actually copied to the device.
Also, it is cumbersome for everyday use.
Thoughts on supporting dlpack protocols? This way we don't need to know if the inputs are torch or numpy or anything else. The data can be shared using the same protocol.
Does it not come with its own set of data type enums that seem to be even more restrictive than numpy?
Hmm. I tested it and there are constraints. dlpack supports all numpy types except strings. It additionally supports bfloat16, but not the float8 types. Although, as an ML data-exchange protocol, I can foresee float8 support being added.
A major advantage is that it is supported by all major frameworks (numpy, torch, jax, tensorflow, mlx, etc.), so ORT could be compatible with many different tensor types easily.
>>> import torch
>>> import jax
>>> t = torch.tensor(42, dtype=torch.bfloat16)
>>> jax.dlpack.from_dlpack(t)
Array(42, dtype=bfloat16)
We can add dlpack support; in fact, I have seen some code in pybind referring to dlpack, but I am not seeing it support what we need. Should we not design something that we can own, maintain, and that delivers what we need?
It seems we can extend numpy's type support for our needs. See, for example, a project that implements the bfloat16 type and makes it a first-class citizen in numpy: https://github.com/GreenWaves-Technologies/bfloat16/tree/main
There is also https://github.com/jax-ml/ml_dtypes (from Google), which implements all the relevant types that we use in the ONNX IR. Upstreaming seems challenging based on the context that project gives.
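A short example of the ml_dtypes approach, assuming the `ml_dtypes` package is installed; it registers bfloat16 (and the float8 variants) as numpy extension dtypes, so they behave like any other dtype:

```python
import numpy as np
import ml_dtypes

# Once registered, bfloat16 can be used anywhere a numpy dtype is
# expected; 1.5 and 2.25 are exactly representable in bfloat16.
x = np.array([1.5, 2.25, -3.0], dtype=ml_dtypes.bfloat16)
y = x.astype(np.float32)  # lossless upcast back to float32
```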
Related Issues (20)
- export an onnx graph for a single decoder layer for LLAMA HOT 1
- Regression: Torch exported Onnx doesn't run after Onnxruntime>=1.17 update - [ShapeInferenceError] HOT 3
- [Build] Cannot build onnxruntime_optimizer HOT 13
- [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for DeformConv(19) node with name 'p2o.DeformConv.0' HOT 1
- [Build] passing --arm64 to ci_build/build.py has error in arm64 host HOT 3
- [Build] quantization unittest failed when run all tests
- Please Add webpack and typescript configuration HOT 2
- How to use I/O binding if input tensor shape is not fixed HOT 2
- [Build] wheels of 1.17/1.18 not found installing with uv HOT 3
- [Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR) HOT 2
- [Feature Request] prebuilt package with cudnn 9 support
- When will onnxruntime make it available to get nodeunit inputs correctly for QLinearConcat HOT 4
- [Build] support for CPython 3.13.0b1
- [Feature Request] why is `FunctionProto` missing from the file "onnxruntime/core/providers/shared_library/provider_wrappedtypes.h"? HOT 3
- [Build] Dockerfile.tensorrt out of date HOT 3
- [Web] I can't use onnruntime-web to load a onnx model in a react web HOT 5
- Unauthorized for `onnxruntime-cuda-12/pypi/simple/`
- [Documentation] The documentation for early versions is missing HOT 2
- pip install failure for onnxruntime==1.17.3 HOT 1
- Index put loop model regression with ort==1.18