Comments (5)
Did you try to see if it works with onnxruntime==1.18?
from onnxruntime.
I switched to onnxruntime==1.18, but it still returns the same error when I try to pre-process.
If I simply use quantize_dynamic, it works fine, but the result fails check_model.
I set opset_version to the default (14) when exporting from PyTorch; my torch version is torch 2.3 with CUDA 11.8.
Do you have any insights?
Are you using the latest onnx package?
I have updated onnx to 1.16.1 and onnxruntime to 1.18.0, and quantization now succeeds.
However, when I tried to run it in onnxruntime, it reported an error.
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.