Describe the issue I need to bind tensor input and output using I/

ONNX I/O Binding about onnxruntime HOT 5 CLOSED

suhailes1 commented on August 19, 2024

ONNX I/O Binding

from onnxruntime.

Comments (5)

tianleiwu commented on August 19, 2024

(1) CUDA graph requires inputs to be bound to a fixed buffer on the GPU (so you will need to copy the input to the same address in GPU memory for every run). In your case, the input is bound to CPU memory:

Ort::MemoryInfo memory_info = Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);

(2) The memory is freed before the data is read into a vector:

cudaFree(output_data_ptr); //suhail
...
std::vector output(rawOutput, rawOutput + count);

Please follow these examples to use I/O Binding and CUDA Graph:

onnxruntime/onnxruntime/test/shared_lib/test_inference.cc

Line 2074 in 068bb3d

TEST(CApiTest, io_binding_cuda) {

onnxruntime/onnxruntime/test/shared_lib/test_inference.cc

Line 2177 in 068bb3d

TEST(CApiTest, basic_cuda_graph) {
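
Points (1) and (2) above can be illustrated without a GPU. The following is a minimal host-only sketch, not ONNX Runtime or CUDA code: `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree`, the lambda stands in for a captured CUDA graph, and `run_batches` is a hypothetical name introduced only for this illustration. It shows the buffers being allocated once and refilled at the same address for every run, and the results being copied into a `std::vector` before the buffer is freed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Host-only stand-in: std::malloc/std::free play the role of cudaMalloc/cudaFree,
// and the lambda plays the role of a captured graph that remembers raw addresses.
// Assumes count > 0 and that every batch holds exactly `count` floats.
std::vector<float> run_batches(const std::vector<std::vector<float>>& batches,
                               std::size_t count) {
    // Allocate the input and output buffers once, before any run.
    float* input_buf = static_cast<float*>(std::malloc(count * sizeof(float)));
    float* output_buf = static_cast<float*>(std::malloc(count * sizeof(float)));
    assert(input_buf != nullptr && output_buf != nullptr);

    // "Captured" work: tied to these two addresses for its whole lifetime.
    auto replay = [input_buf, output_buf, count]() {
        for (std::size_t i = 0; i < count; ++i) {
            output_buf[i] = 2.0f * input_buf[i];
        }
    };

    std::vector<float> last_output;
    for (const auto& batch : batches) {
        assert(batch.size() == count);
        // Point (1): copy each new input into the same buffer before replaying.
        std::memcpy(input_buf, batch.data(), count * sizeof(float));
        replay();
        // Point (2): copy the results out while the buffer is still allocated.
        last_output.assign(output_buf, output_buf + count);
    }

    // Free only after the data has been copied out.
    std::free(input_buf);
    std::free(output_buf);
    return last_output;
}
```

With the batches {1, 2, 3, 4} and {10, 20, 30, 40} and `count` set to 4, `run_batches` returns the doubled values of the last batch: 20, 40, 60, 80. For real device memory, the two linked tests remain the reference.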

yuslepukhin commented on August 19, 2024

Where do you populate input/output names?

suhailes1 commented on August 19, 2024

Hi yuslepukhin,
Please check the snippet below. I got the input and output names from there.

Ort::Session session(env, model_path.c_str(), session_options);

// Create an allocator object based on default options to provide memory allocation functions for subsequent operations
Ort::AllocatorWithDefaultOptions allocator;

// Get the number of input nodes
size_t num_input_nodes = session.GetInputCount();

// Get the number of output nodes
size_t num_output_nodes = session.GetOutputCount();

// Get input node name and dimensions
for (size_t i = 0; i < num_input_nodes; i++) {
    auto input_name = session.GetInputNameAllocated(i, allocator);
    input_node_names.push_back(input_name.get());
    Ort::TypeInfo input_typeinfo = session.GetInputTypeInfo(i);
    auto input_tensorinfo = input_typeinfo.GetTensorTypeAndShapeInfo();
    auto input_dims = input_tensorinfo.GetShape();

    ONNXTensorElementDataType inputType = input_tensorinfo.GetElementType();

    if (input_dims.at(0) == IMR_ERROR)
    {
        std::cout << "[Warning] Got dynamic batch size. Setting input batch size to "
                << BATCH_SIZE << "." << std::endl;
        input_dims.at(0) = BATCH_SIZE;
    }

    input_node_dims.push_back(input_dims);

    std::cout << "[INFO] Input name and shape is: " << input_name.get() << " [";
    for (size_t j = 0; j < input_dims.size(); j++) {
        std::cout << input_dims[j];
        if (j != input_dims.size()-1) {
            std::cout << ",";
        }
    }
    std::cout << ']' << std::endl;
}

// Get output node name
std::vector<std::vector<int64_t>> output_node_dims;
for (size_t i = 0; i < num_output_nodes; i++) {
    auto output_name = session.GetOutputNameAllocated(i, allocator);
    output_node_names.push_back(output_name.get());
    Ort::TypeInfo output_typeinfo = session.GetOutputTypeInfo(i);
    auto output_tensorinfo = output_typeinfo.GetTensorTypeAndShapeInfo();
    auto output_dims = output_tensorinfo.GetShape();

    if (output_dims.at(0) == IMR_ERROR)
    {
        std::cout << "[Warning] Got dynamic batch size. Setting output batch size to "
                << BATCH_SIZE << "." << std::endl;
        output_dims.at(0) = BATCH_SIZE;
    }

    output_node_dims.push_back(output_dims);

    std::cout << "[INFO] Output name and shape is: " << output_name.get() << " [";
    for (size_t j = 0; j < output_dims.size(); j++) {
        std::cout << output_dims[j];
        if (j != output_dims.size()-1) {
            std::cout << ",";
        }
    }
    std::cout << ']' << std::endl;
}
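
One thing to check in the snippet above: `GetInputNameAllocated` and `GetOutputNameAllocated` return an owning smart pointer (`Ort::AllocatedStringPtr`), so the string is released when `input_name` or `output_name` goes out of scope at the end of each loop iteration. The declaration of `input_node_names` is not shown; if it is a `std::vector<const char*>`, the stored pointers would dangle. Below is a minimal sketch of one way to avoid that, using only the standard library. `OwnedName`, `make_name`, `collect_names`, and `as_c_strings` are hypothetical stand-ins for illustration, not ONNX Runtime APIs.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a name handed back as an owning smart pointer.
using OwnedName = std::unique_ptr<char[]>;

// Hypothetical helper: allocates a NUL-terminated copy of `s`.
OwnedName make_name(const std::string& s) {
    OwnedName p(new char[s.size() + 1]);
    std::memcpy(p.get(), s.c_str(), s.size() + 1);
    return p;
}

// Deep-copy each name into a std::string before its owner is destroyed.
std::vector<std::string> collect_names(const std::vector<std::string>& model_names) {
    std::vector<std::string> names;
    names.reserve(model_names.size());
    for (std::size_t i = 0; i < model_names.size(); ++i) {
        OwnedName name = make_name(model_names[i]);
        names.emplace_back(name.get());  // copies the characters
    }  // `name` is freed here; the std::string copy stays valid
    return names;
}

// Build the const char* view that C-style APIs expect.
// The pointers stay valid only while `names` is alive and unmodified.
std::vector<const char*> as_c_strings(const std::vector<std::string>& names) {
    std::vector<const char*> out;
    out.reserve(names.size());
    for (const auto& n : names) {
        out.push_back(n.c_str());
    }
    return out;
}
```

The design choice here is to keep one container that owns the strings and to derive the raw-pointer list from it only when a call needs it, so the pointers never outlive their storage.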

suhailes1 commented on August 19, 2024

I based the code on an example from Git. Because my project needs CUDA performance, I need to add I/O Binding, so I went through the reference on the ONNX Runtime site [https://onnxruntime.ai/docs/performance/tune-performance/iobinding.html].
Is there anything I missed in the I/O Binding logic?

suhailes1 commented on August 19, 2024

Hi Tianlei Wu,

Thank you so much. I resolved my issue using the references you shared.
