
Comments (4)

ccccjunkang avatar ccccjunkang commented on July 19, 2024
20240428162830167

from onnxruntime.

yuslepukhin avatar yuslepukhin commented on July 19, 2024

Each session creates a thread pool that is sized to run on all cores. That thread pool is used for intra-op parallelization, meaning many CPU kernels attempt to distribute their work across all the cores for a given session.

Multiple sessions would create lots of contention and context switches.

You may want to experiment with your model and set a different number of threads for the intra-op thread pool.
Or you can make all the sessions use a global thread pool. See if this makes things faster.
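
As a rough sketch of the first suggestion, assuming the onnxruntime Python API (`SessionOptions.intra_op_num_threads`), one option is to split the available cores evenly across the sessions. The helper names `intra_op_threads_per_session` and `make_cpu_session` and the even-split policy are illustrative assumptions, not something ORT prescribes, so the resulting values should be treated as a starting point for measurement.

```python
import os


def intra_op_threads_per_session(num_sessions, total_cores=None):
    """Split the available cores evenly across sessions (at least 1 each).

    Hypothetical helper: the even split is only a starting point for tuning.
    """
    if num_sessions < 1:
        raise ValueError("num_sessions must be >= 1")
    cores = total_cores if total_cores is not None else (os.cpu_count() or 1)
    return max(1, cores // num_sessions)


def make_cpu_session(model_path, num_sessions):
    """Create one CPU session whose intra-op pool uses its share of the cores."""
    # Imported lazily so the helper above works without onnxruntime installed.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.intra_op_num_threads = intra_op_threads_per_session(num_sessions)
    return ort.InferenceSession(
        model_path, sess_options=opts, providers=["CPUExecutionProvider"]
    )
```

With 16 cores and 4 sessions this gives each session 4 intra-op threads; whether that beats the default depends on the model, so compare timings for a few values.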

For CUDA, though, disabling the thread pool altogether by setting the number of threads to 1 seems to be the best option for each of the sessions, since CUDA kernels do not use the CPU thread pools.
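
A minimal sketch of that CUDA setup, again assuming the onnxruntime Python API with the CUDA execution provider installed. The function names and the `device_id` default are illustrative assumptions.

```python
def cuda_providers(device_id=0):
    """Provider list that prefers CUDA and falls back to CPU."""
    return [
        ("CUDAExecutionProvider", {"device_id": device_id}),
        "CPUExecutionProvider",
    ]


def make_cuda_session(model_path, device_id=0):
    """Create a CUDA session with the intra-op thread pool set to one thread."""
    # Imported lazily; requires an onnxruntime build with CUDA support.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # One intra-op thread: the CUDA kernels do not use the CPU thread pool.
    opts.intra_op_num_threads = 1
    return ort.InferenceSession(
        model_path, sess_options=opts, providers=cuda_providers(device_id)
    )
```

Each session in the process would be created this way, so that none of them spins up a CPU pool sized to all cores.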


souptc avatar souptc commented on July 19, 2024

Hi, which kernels do you have in your model?
For implementation reasons, ORT's CUDA EP has some kernels that currently cannot be launched in parallel, for example the conv kernel.

Just double-check whether that is the case.


ccccjunkang avatar ccccjunkang commented on July 19, 2024

Thank you for your reply, @souptc.
In this case there is no conv kernel in the model; only matmul and elementwise kernels are used. The reason for using multiple sessions is to create multiple streams on the GPU, so that kernels can be launched concurrently. Is this lock caused by the CUDA runtime?
I set a CUDA context for each thread to avoid the lock, but it has no effect.

