Comments (4)
The attached Phi-3 ONNX model is not shape inferred for all operators. A couple of operators might have symbolic shapes inferred with dynamic axes, but the vast majority are not shape inferred. For example, here is one of the subgraphs of the model that is not shape inferred, visualized in Netron. From my understanding, this is how Netron visualizes shape-inferred operators after running the model through the SymbolicShapeInference.infer_shapes tool, which I was not able to do for the Phi-3 model (this subgraph is from a different ONNX model).
You can find the inferred shapes by clicking on the operator and pressing the '+' icon to the right of each input and output name. Here is an example.
I do not see the MatMulNBits operator in the list of supported operators you shared for the SymbolicShapeInference.infer_shapes tool, which might be why the tool raises the error.
Yes, your error occurs because symbolic shape inference for MatMulNBits isn't implemented in SymbolicShapeInference.infer_shapes. We can add MatMulNBits to fix this.
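As a rough illustration of what such a rule has to compute (not the actual change to the tool): per the MatMulNBits contrib op spec, input A has shape [..., K] and the attribute N gives the number of output columns, so the output keeps A's leading dimensions and replaces the last one with N. The function name below is hypothetical.

```python
from typing import List, Union

# int for static dims, str for symbolic dims such as "batch_size"
Dim = Union[int, str]


def matmul_nbits_output_shape(a_shape: List[Dim], n: int) -> List[Dim]:
    """Output shape of MatMulNBits: A's shape with the last dim (K) replaced by N."""
    if not a_shape:
        raise ValueError("input A must have rank >= 1")
    # Leading dims (static or symbolic) pass through unchanged.
    return list(a_shape[:-1]) + [n]


print(matmul_nbits_output_shape(["batch_size", "sequence_length", 3072], 9216))
# ['batch_size', 'sequence_length', 9216]
```

Symbolic dimensions are carried as plain strings here; the real tool tracks them as sympy expressions, so this is only a sketch of the shape arithmetic.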
Were you able to successfully shape infer the phi-3 model for all operators? I am not able to do it with release version of onnxruntime 1.18.0. Which version of onnxruntime are you using?
The uploaded Phi-3 ONNX models are created via ONNX Runtime GenAI's model builder. The shape information for their operators is created in the model builder using onnx.helper.make_tensor_value_info and then added to the ModelProto.
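One way to check how much of a model carries such annotations, without relying on Netron, is to compare node outputs against the graph's value_info and output lists. This is a minimal sketch: the helper names are made up, it only looks at the top-level graph, and it checks that an entry exists rather than that its shape is fully populated.

```python
def outputs_missing_shape_info(model):
    """Return node output names with no entry in graph.value_info or graph.output."""
    graph = model.graph
    known = {vi.name for vi in graph.value_info}
    known.update(out.name for out in graph.output)
    missing = []
    for node in graph.node:
        for name in node.output:
            # Empty names mark optional outputs that the node does not produce.
            if name and name not in known:
                missing.append(name)
    return missing


def report(path):
    # Imported lazily so the helper above has no hard dependency on onnx.
    import onnx

    # Only graph structure is needed, so skip loading external weight files.
    model = onnx.load(path, load_external_data=False)
    missing = outputs_missing_shape_info(model)
    print(f"{len(missing)} node outputs have no shape annotation")
    return missing
```

Calling `report("model.onnx")` (the path is a placeholder) prints how many intermediate tensors lack an annotation and returns their names.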
from onnxruntime.
@kunal-vaishnavi, could you take a look at whether symbolic shape inference works on Phi-3 models?
from onnxruntime.
The uploaded Phi-3 ONNX models have already been through symbolic shape inference with dynamic axes.
The symbolic shape inference for most quantization operators is defined in each operator's spec.
onnxruntime/onnxruntime/core/graph/contrib_ops/contrib_defs.cc
Lines 3434 to 3481 in 6baaaf5
Here is the list of supported operators whose shapes can be symbolically inferred in the SymbolicShapeInference.infer_shapes tool.
onnxruntime/onnxruntime/python/tools/symbolic_shape_infer.py
Lines 127 to 247 in 6baaaf5
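To see which operators in a given model are absent from that list, the node op types can be compared against it. The sketch below assumes the list is exposed as the `dispatcher_` dict on a SymbolicShapeInference instance; that is an internal attribute and may differ between releases. The function names are hypothetical.

```python
from collections import Counter


def op_types_without_handler(model, supported_ops):
    """Count nodes whose op_type is absent from supported_ops (top-level graph only)."""
    counts = Counter(node.op_type for node in model.graph.node)
    return {op: n for op, n in counts.items() if op not in supported_ops}


def supported_ops_from_tool():
    # Assumption: the op table lives in the internal `dispatcher_` dict.
    from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

    tool = SymbolicShapeInference(
        int_max=2**31 - 1, auto_merge=False, guess_output_rank=False, verbose=0
    )
    return set(tool.dispatcher_)
```

An op type missing from the table is not necessarily unsupported, since standard ONNX operators can still be handled by ONNX's own shape inference; contrib operators such as MatMulNBits are the ones most likely to need an explicit entry.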
from onnxruntime.
@kunal-vaishnavi, thanks for the response. I have a few questions and comments from my side:
- The attached Phi-3 ONNX model is not shape inferred for all operators. A couple of operators might have symbolic shapes inferred with dynamic axes, but the vast majority are not shape inferred. For example, here is one of the subgraphs of the model that is not shape inferred, visualized in Netron:
From my understanding, this is how Netron visualizes shape-inferred operators after running the model through the SymbolicShapeInference.infer_shapes tool, which I was not able to do for the Phi-3 model (this subgraph is from a different ONNX model):
- I do not see the MatMulNBits operator in the list of supported operators you shared for the SymbolicShapeInference.infer_shapes tool, which might be why the tool raises the error.
- Were you able to successfully shape infer the Phi-3 model for all operators? I am not able to do it with the release version of onnxruntime 1.18.0. Which version of onnxruntime are you using?
from onnxruntime.