Describe the issue Testing ONNXRuntime 1.18 with TensorRT EP eithe


ONNXRuntime 1.18 crashing with TensorRT EP when dealing with big inputs (microsoft/onnxruntime issue #21001, open)

sansrem commented on July 19, 2024

ONNXRuntime 1.18 crashing with TensorRT EP when dealing with big inputs

Comments (5)

chilo-ms commented on July 19, 2024

The error message "TensorRT EP failed to create engine from network" indicates that something went wrong when the TRT EP called TensorRT's buildSerializedNetwork() API. Since it happens when dealing with large images, I suspect it's due to an out-of-memory (OOM) condition.

Could you increase trt_max_workspace_size and see whether that helps? The default is 1 GB.

Also, a quick question: can you reproduce the issue using trtexec?
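For anyone unfamiliar with trtexec, here is a minimal sketch of how a standalone reproduction command might be assembled. It only builds and prints the command line; it does not run TensorRT. The file name `model.onnx` and the 4096 MiB workspace are placeholders, not values from this thread, and the `--memPoolSize=workspace:<MiB>` flag is assumed to be available in the installed trtexec (older TensorRT releases used `--workspace=<MiB>` instead).

```python
import shlex


def trtexec_command(model_path, workspace_mib=4096, verbose=True):
    """Build a trtexec argument list for a standalone engine-build attempt.

    model_path and workspace_mib are illustrative placeholders.
    """
    args = [
        "trtexec",
        f"--onnx={model_path}",                      # ONNX model to parse
        f"--memPoolSize=workspace:{workspace_mib}",  # workspace limit in MiB (assumed flag)
    ]
    if verbose:
        args.append("--verbose")                     # print per-tactic details
    return args


cmd = trtexec_command("model.onnx")
print(shlex.join(cmd))
```

If the build fails in trtexec as well, the problem is likely on the TensorRT side rather than in the ONNX Runtime execution provider.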

sansrem commented on July 19, 2024

Hi,

I tried with trt_max_workspace_size (https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#trt_max_workspace_size) set to 2G, 4G, and 8G, with the same result. I also get this additional warning when it is set greater than 1G:

024-06-13 07:26:15.840575541 [W:onnxruntime:CF, tensorrt_execution_provider.cc:1479 TensorrtExecutionProvider] [TensorRT EP] TensorRT option trt_max_workspace_size must be a positive integer value. Set it to 1073741824 (1GB)

I'm not really familiar with trtexec. I tried it by just specifying the ONNX model, and it failed with:

[06/13/2024-17:20:39] [E] [TRT] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:4531 In function importSlice: [8] Assertion failed: (axes.allValuesKnown()) && "This version of TensorRT does not support dynamic axes."
[06/13/2024-17:20:39] [E] Failed to parse onnx file
[06/13/2024-17:20:39] [I] Finish parsing network model
[06/13/2024-17:20:39] [E] Parsing model failed
[06/13/2024-17:20:39] [E] Failed to create engine from model or file.
[06/13/2024-17:20:39] [E] Engine set up failed

I used TensorRT 8.5.3 in this case.

chilo-ms commented on July 19, 2024

> I tried with trt_max_workspace_size set to 2G, 4G, 8G with the same result getting also this additional warning if it is set greater than 1G

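Hmm, that's strange. Could you share the code that sets trt_max_workspace_size?
Please see the example code here: https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#click-below-for-c-api-example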

As for trtexec, some models are not fully TRT eligible, and that seems to be the case for your model, so trtexec won't be able to run it. How about trying trtexec with TRT 10?
Could you share the proxy model so that we can reproduce the issue on our side? Or could you point to a public model that reproduces it?
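For comparison, a minimal sketch of how the option can be passed through the Python API, assuming the value is given as a plain positive integer number of bytes (the warning quoted above falls back to 1073741824, which is 1 GiB in bytes). The thread does not say which language binding the reporter used or what the actual mistake was, so this is only an illustration. The helper name `trt_providers` and the file name `model.onnx` are hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def trt_providers(workspace_gib):
    """Return a providers list with trt_max_workspace_size given in bytes.

    Hypothetical helper: rejects values that are not positive integers,
    mirroring the constraint stated in the warning.
    """
    if not isinstance(workspace_gib, int) or workspace_gib <= 0:
        raise ValueError("workspace size must be a positive integer number of GiB")
    trt_options = {"trt_max_workspace_size": workspace_gib * GIB}
    # Fall back to the CUDA EP for nodes the TensorRT EP does not take.
    return [("TensorrtExecutionProvider", trt_options), "CUDAExecutionProvider"]


providers = trt_providers(4)
print(providers[0][1]["trt_max_workspace_size"])  # 4294967296

# With onnxruntime-gpu installed, the list is passed at session creation:
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```

Printing the value before creating the session is a cheap way to confirm that the number reaching the provider is the intended byte count.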

sansrem commented on July 19, 2024

I found the problem on my side with trt_max_workspace_size and re-validated with 2G, 4G, and 8G. I am still getting:

2024-06-14 11:34:30.389829469 [W:onnxruntime:CF, tensorrt_execution_provider.h:84 log] [2024-06-14 15:34:30 WARNING] Skipping tactic 0x0000000000000000 due to exception autotuning: CUDA error 2 allocating 6370102777-byte buffer: out of memory
2024-06-14 11:34:30.480769226 [E:onnxruntime:CF, tensorrt_execution_provider.h:82 log] [2024-06-14 15:34:30 ERROR] 4: [optimizer.cpp::computeCosts::3726] Error Code 4: Internal Error (Could not find any implementation for node {ForeignNode[onnx::Cast_507[Constant]...Concat_372]} due to insufficient workspace. See verbose log for requested sizes.)
2024-06-14 11:34:30.520078719 [E:onnxruntime:CF, tensorrt_execution_provider.h:82 log] [2024-06-14 15:34:30 ERROR] 2: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
2024-06-14 11:34:30.520215247 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running TRTKernel_graph_torch_jit_16074816800397161377_0 node. Name:'TensorrtExecutionProvider_TRTKernel_graph_torch_jit_16074816800397161377_0_0' Status Message: TensorRT EP failed to create engine from network.

We already sent the model (FLMFRIFE_Untrained.onnx) to a member of the ONNXRuntime team: Scott McKay.
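To put the numbers in the log in perspective, the short calculation below converts the failed 6370102777-byte allocation to GiB and compares it with the workspace limits that were tried. It is only arithmetic on the value from the log and does not identify the root cause. How much free GPU memory the device had is not stated in the thread, and the helper name `fits_under` is hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB

requested_bytes = 6370102777  # buffer size from the "CUDA error 2" warning


def fits_under(limit_gib, size_bytes=requested_bytes):
    """True if a buffer of size_bytes is within a limit given in GiB."""
    return size_bytes <= limit_gib * GIB


print(f"requested: {requested_bytes / GIB:.2f} GiB")
for limit_gib in (1, 2, 4, 8):
    verdict = "fits" if fits_under(limit_gib) else "exceeds"
    print(f"{limit_gib} GiB limit: {verdict}")
```

The single buffer is close to 6 GiB, so it exceeds the 1, 2, and 4 GiB limits and fits only under the 8 GiB one. Even then the allocation still has to succeed on the device, which is what the "out of memory" message reports as failing.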
