
Comments (5)

malfet commented on September 22, 2024

Adding triage review to discuss what to do about various performance issues and how we should approach them.


tringwald commented on September 22, 2024

By using os.sched_setaffinity() you are basically telling your OS to only allow the current process to use certain CPU cores. By default, torch uses torch.get_num_threads() threads (usually CPU cores / 2) for computation. So in your example you are forcing the OS to schedule N threads onto N-1 cores, which probably leads to some very heavy congestion.
You should probably use torch.set_num_threads() to limit the number of threads torch will use.
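A minimal sketch of that suggestion (Linux-only, since os.sched_setaffinity() is not available on every platform; the core set here is purely illustrative):

```python
import os
import torch

# Illustrative core set: pin the current process to four cores.
core_ids = {0, 1, 2, 3}
os.sched_setaffinity(0, core_ids)  # pid 0 = the current process

# Size torch's intra-op thread pool to match the pinned cores, so the OS
# is never asked to schedule more compute threads than it has cores for.
torch.set_num_threads(len(os.sched_getaffinity(0)))

print(torch.get_num_threads())  # now equals len(core_ids)
```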


malfet commented on September 22, 2024

It feels like this issue belongs on https://discuss.pytorch.org/

Though if I were to rephrase it, it is probably about documenting the interoperability expectations between the PyTorch runtime (which uses OpenMP) and Python's threading library.

Also, I wonder if torch.set_num_threads(len(core_ids)) would help.


ye3084 commented on September 22, 2024

@tringwald @malfet
Thank you for your responses. I used torch.set_num_threads() and it solved the problem, but I found that when I use more than 8 cores, the thread execution time increases. It seems that using 8 cores gives the best performance. I've created a chart where the X-axis represents the number of cores used and the Y-axis shows the execution time of the threads.
[Chart: thread execution time vs. number of cores used]

The results I obtained on another server with a 72-core CPU are as follows (this server is shared with other users, which might affect performance).
[Chart: thread execution time vs. number of cores used, 72-core server]
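For reference, one way such a timing sweep could be reproduced; the workload size, repetition count, and thread counts below are illustrative, not the exact benchmark behind the charts:

```python
import time
import torch

# Fixed matmul workload, timed while varying torch's intra-op thread pool.
x = torch.randn(2048, 2048)

for n in (1, 2, 4, 8, 16):
    torch.set_num_threads(n)
    start = time.perf_counter()
    for _ in range(20):
        torch.mm(x, x)
    print(f"{n:>2} threads: {time.perf_counter() - start:.3f} s")
```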


tringwald commented on September 22, 2024

Your CPU only has 8 (or 36) physical cores; with hyperthreading/SMT that makes 16/72 logical cores. SMT only really helps when threads are waiting on something like I/O. In a heavy number-crunching task like NN inference, there is no real advantage to using logical cores. Having multiple threads fight over the same physical core only leads to congestion and cache eviction.
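One way to act on this, sketched under the assumption that the third-party psutil package is available (torch itself does not expose a physical-core count):

```python
import torch
import psutil  # assumption: third-party package, not part of torch

# cpu_count(logical=False) counts physical cores only; plain cpu_count()
# would also include the SMT/hyperthreading siblings.
physical_cores = psutil.cpu_count(logical=False)
torch.set_num_threads(physical_cores)
```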

