[YOLOV7-QAT] Cannot convert onnx to trt engine about yolo_deepstream HOT 4 OPEN

HoangTienDuc commented on June 4, 2024


Comments (4)

HoangTienDuc commented on June 4, 2024

This is the full log

&&&& RUNNING TensorRT.trtexec [TensorRT v8500] # /usr/src/tensorrt/bin/trtexec --onnx=qat_models/trained_qat/pgie/1/qat.onnx --int8 --fp16 --workspace=1024000 --minShapes=images:4x3x416x416 --optShapes=images:4x3x416x416 --maxShapes=images:4x3x416x416
[12/04/2023-09:06:56] [W] --workspace flag has been deprecated by --memPoolSize flag.
[12/04/2023-09:06:56] [I] === Model Options ===
[12/04/2023-09:06:56] [I] Format: ONNX
[12/04/2023-09:06:56] [I] Model: qat_models/trained_qat/pgie/1/qat.onnx
[12/04/2023-09:06:56] [I] Output:
[12/04/2023-09:06:56] [I] === Build Options ===
[12/04/2023-09:06:56] [I] Max batch: explicit batch
[12/04/2023-09:06:56] [I] Memory Pools: workspace: 1.024e+06 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[12/04/2023-09:06:56] [I] minTiming: 1
[12/04/2023-09:06:56] [I] avgTiming: 8
[12/04/2023-09:06:56] [I] Precision: FP32+FP16+INT8
[12/04/2023-09:06:56] [I] LayerPrecisions: 
[12/04/2023-09:06:56] [I] Calibration: Dynamic
[12/04/2023-09:06:56] [I] Refit: Disabled
[12/04/2023-09:06:56] [I] Sparsity: Disabled
[12/04/2023-09:06:56] [I] Safe mode: Disabled
[12/04/2023-09:06:56] [I] DirectIO mode: Disabled
[12/04/2023-09:06:56] [I] Restricted mode: Disabled
[12/04/2023-09:06:56] [I] Build only: Disabled
[12/04/2023-09:06:56] [I] Save engine: 
[12/04/2023-09:06:56] [I] Load engine: 
[12/04/2023-09:06:56] [I] Profiling verbosity: 0
[12/04/2023-09:06:56] [I] Tactic sources: Using default tactic sources
[12/04/2023-09:06:56] [I] timingCacheMode: local
[12/04/2023-09:06:56] [I] timingCacheFile: 
[12/04/2023-09:06:56] [I] Heuristic: Disabled
[12/04/2023-09:06:56] [I] Preview Features: Use default preview flags.
[12/04/2023-09:06:56] [I] Input(s)s format: fp32:CHW
[12/04/2023-09:06:56] [I] Output(s)s format: fp32:CHW
[12/04/2023-09:06:56] [I] Input build shape: images=4x3x416x416+4x3x416x416+4x3x416x416
[12/04/2023-09:06:56] [I] Input calibration shapes: model
[12/04/2023-09:06:56] [I] === System Options ===
[12/04/2023-09:06:56] [I] Device: 0
[12/04/2023-09:06:56] [I] DLACore: 
[12/04/2023-09:06:56] [I] Plugins:
[12/04/2023-09:06:56] [I] === Inference Options ===
[12/04/2023-09:06:56] [I] Batch: Explicit
[12/04/2023-09:06:56] [I] Input inference shape: images=4x3x416x416
[12/04/2023-09:06:56] [I] Iterations: 10
[12/04/2023-09:06:56] [I] Duration: 3s (+ 200ms warm up)
[12/04/2023-09:06:56] [I] Sleep time: 0ms
[12/04/2023-09:06:56] [I] Idle time: 0ms
[12/04/2023-09:06:56] [I] Streams: 1
[12/04/2023-09:06:56] [I] ExposeDMA: Disabled
[12/04/2023-09:06:56] [I] Data transfers: Enabled
[12/04/2023-09:06:56] [I] Spin-wait: Disabled
[12/04/2023-09:06:56] [I] Multithreading: Disabled
[12/04/2023-09:06:56] [I] CUDA Graph: Disabled
[12/04/2023-09:06:56] [I] Separate profiling: Disabled
[12/04/2023-09:06:56] [I] Time Deserialize: Disabled
[12/04/2023-09:06:56] [I] Time Refit: Disabled
[12/04/2023-09:06:56] [I] NVTX verbosity: 0
[12/04/2023-09:06:56] [I] Persistent Cache Ratio: 0
[12/04/2023-09:06:56] [I] Inputs:
[12/04/2023-09:06:56] [I] === Reporting Options ===
[12/04/2023-09:06:56] [I] Verbose: Disabled
[12/04/2023-09:06:56] [I] Averages: 10 inferences
[12/04/2023-09:06:56] [I] Percentiles: 90,95,99
[12/04/2023-09:06:56] [I] Dump refittable layers:Disabled
[12/04/2023-09:06:56] [I] Dump output: Disabled
[12/04/2023-09:06:56] [I] Profile: Disabled
[12/04/2023-09:06:56] [I] Export timing to JSON file: 
[12/04/2023-09:06:56] [I] Export output to JSON file: 
[12/04/2023-09:06:56] [I] Export profile to JSON file: 
[12/04/2023-09:06:56] [I] 
[12/04/2023-09:06:56] [I] === Device Information ===
[12/04/2023-09:06:56] [I] Selected Device: NVIDIA GeForce RTX 3060
[12/04/2023-09:06:56] [I] Compute Capability: 8.6
[12/04/2023-09:06:56] [I] SMs: 28
[12/04/2023-09:06:56] [I] Compute Clock Rate: 1.777 GHz
[12/04/2023-09:06:56] [I] Device Global Memory: 12041 MiB
[12/04/2023-09:06:56] [I] Shared Memory per SM: 100 KiB
[12/04/2023-09:06:56] [I] Memory Bus Width: 192 bits (ECC disabled)
[12/04/2023-09:06:56] [I] Memory Clock Rate: 7.501 GHz
[12/04/2023-09:06:56] [I] 
[12/04/2023-09:06:56] [I] TensorRT version: 8.5.0
[12/04/2023-09:06:56] [I] [TRT] [MemUsageChange] Init CUDA: CPU +11, GPU +0, now: CPU 24, GPU 801 (MiB)
[12/04/2023-09:06:57] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +421, GPU +114, now: CPU 497, GPU 915 (MiB)
[12/04/2023-09:06:57] [I] Start parsing network model
[12/04/2023-09:06:58] [I] [TRT] ----------------------------------------------------------------
[12/04/2023-09:06:58] [I] [TRT] Input filename:   qat_models/trained_qat/pgie/1/qat.onnx
[12/04/2023-09:06:58] [I] [TRT] ONNX IR version:  0.0.7
[12/04/2023-09:06:58] [I] [TRT] Opset version:    13
[12/04/2023-09:06:58] [I] [TRT] Producer name:    pytorch
[12/04/2023-09:06:58] [I] [TRT] Producer version: 1.13.0
[12/04/2023-09:06:58] [I] [TRT] Domain:           
[12/04/2023-09:06:58] [I] [TRT] Model version:    0
[12/04/2023-09:06:58] [I] [TRT] Doc string:       
[12/04/2023-09:06:58] [I] [TRT] ----------------------------------------------------------------
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:740: While parsing node number 467 [QuantizeLinear -> "onnx::DequantizeLinear_924"]:
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:741: --- Begin node ---
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:742: input: "model.51.cv1.conv.weight"
input: "onnx::QuantizeLinear_921"
input: "onnx::QuantizeLinear_1885"
output: "onnx::DequantizeLinear_924"
name: "QuantizeLinear_467"
op_type: "QuantizeLinear"
attribute {
  name: "axis"
  i: 0
  type: INT
}

[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:743: --- End node ---
[12/04/2023-09:06:58] [E] [TRT] ModelImporter.cpp:746: ERROR: builtin_op_importers.cpp:1192 In function QuantDequantLinearHelper:
[6] Assertion failed: scaleAllPositive && "Scale coefficients must all be positive"
[12/04/2023-09:06:58] [E] Failed to parse onnx file
[12/04/2023-09:06:58] [I] Finish parsing network model
[12/04/2023-09:06:58] [E] Parsing model failed
[12/04/2023-09:06:58] [E] Failed to create engine from model or file.
[12/04/2023-09:06:58] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8500] # /usr/src/tensorrt/bin/trtexec --onnx=qat_models/trained_qat/pgie/1/qat.onnx --int8 --fp16 --workspace=1024000 --minShapes=images:4x3x416x416 --optShapes=images:4x3x416x416 --maxShapes=images:4x3x416x416


hopef commented on June 4, 2024

The "Scale coefficients must all be positive" assertion fires when a stored scale value is zero. This is a bug in the pytorch_quantization library. It can be fixed by constraining the value of amax (for example, amax.clamp(1e-6)) when exporting to ONNX. See tensor_quantizer.py.
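
As a rough illustration of the amax clamp suggested above, here is a minimal sketch that can be run on the model just before export. It assumes each quantizer module keeps its calibrated range in a tensor attribute named _amax (my understanding of how pytorch_quantization's TensorQuantizer stores it; please verify against your installed version). The function name clamp_amax_ and the default floor of 1e-6 are illustrative, not part of the library.

```python
import torch
from torch import nn


def clamp_amax_(model: nn.Module, min_amax: float = 1e-6) -> int:
    """Clamp every `_amax` tensor found in `model` to at least `min_amax`, in place.

    Assumption: quantizer modules expose their calibrated range as a tensor
    attribute called `_amax`. Modules without such an attribute are skipped.
    Returns the number of modules whose `_amax` was visited.
    """
    visited = 0
    for module in model.modules():
        amax = getattr(module, "_amax", None)
        if isinstance(amax, torch.Tensor):
            with torch.no_grad():
                # In-place clamp, so the stored tensor itself is modified.
                amax.clamp_(min=min_amax)
            visited += 1
    return visited


# Hypothetical usage, just before torch.onnx.export(...):
# n = clamp_amax_(qat_model)
# print(f"clamped amax on {n} quantizers")
```

Note that Tensor.clamp returns a new tensor and leaves the original untouched, so a bare amax.clamp(1e-6) has no effect unless the result is assigned back. The in-place clamp_ used here avoids that pitfall.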


hopef commented on June 4, 2024

Alternatively, you can change the scale values directly in the exported ONNX file, using the onnx package in Python.


Doctor-L-end commented on June 4, 2024

@hopef I ran into this problem too, but I still get the error after adding the amax.clamp(1e-6) you mentioned above.
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:771: While parsing node number 175 [QuantizeLinear -> "/model.7/conv/_weight_quantizer/QuantizeLinear_output_0"]:
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:773: input: "model.7.conv.weight"
input: "/model.7/conv/_weight_quantizer/Constant_output_0"
input: "/model.7/conv/_weight_quantizer/Constant_1_output_0"
output: "/model.7/conv/_weight_quantizer/QuantizeLinear_output_0"
name: "/model.7/conv/_weight_quantizer/QuantizeLinear"
op_type: "QuantizeLinear"
attribute {
  name: "axis"
  i: 0
  type: INT
}

[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[01/12/2024-11:24:30] [E] [TRT] ModelImporter.cpp:777: ERROR: builtin_op_importers.cpp:1197 In function QuantDequantLinearHelper:
[6] Assertion failed: scaleAllPositive && "Scale coefficients must all be positive"
[01/12/2024-11:24:30] [E] Failed to parse onnx file
[01/12/2024-11:24:30] [I] Finished parsing network model. Parse time: 0.0231789
[01/12/2024-11:24:30] [E] Parsing model failed
[01/12/2024-11:24:30] [E] Failed to create engine from model or file.
[01/12/2024-11:24:30] [E] Engine set up failed

