
Comments (6)

cuviper commented on July 1, 2024

That's worrisome, but I'm afraid I don't have any Apple hardware to test this myself. Hopefully others in the community can share their experience and help debug what's going on.

One small tip -- if you haven't set num_threads manually, the RAYON_NUM_THREADS environment variable will also override the default setting.
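The precedence described in that tip can be sketched with the standard library alone. This is a rough illustration, not rayon's actual code: `resolve_num_threads` is a hypothetical helper, and it assumes the behaviour described in the rayon documentation, where an explicit non-zero `num_threads` wins, then a positive `RAYON_NUM_THREADS`, then the number of logical CPUs.

```rust
use std::env;
use std::thread;

/// Hypothetical helper (not part of rayon): picks a thread count using the
/// precedence assumed above.
fn resolve_num_threads(
    explicit: Option<usize>,
    env_value: Option<&str>,
    logical_cpus: usize,
) -> usize {
    // 1. An explicit, non-zero num_threads(...) takes priority.
    if let Some(n) = explicit {
        if n > 0 {
            return n;
        }
    }
    // 2. Otherwise a positive integer in RAYON_NUM_THREADS is used.
    if let Some(n) = env_value.and_then(|s| s.parse::<usize>().ok()) {
        if n > 0 {
            return n;
        }
    }
    // 3. Otherwise fall back to the number of logical CPUs (at least 1).
    logical_cpus.max(1)
}

fn main() {
    let cpus = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let env_value = env::var("RAYON_NUM_THREADS").ok();
    let n = resolve_num_threads(None, env_value.as_deref(), cpus);
    println!("a default pool would likely use {n} threads");
}
```

Running the program as `RAYON_NUM_THREADS=4 ./your_binary` (the binary name is a placeholder) is then a quick way to try a smaller pool without recompiling, provided `num_threads` has not been set in code.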

from rayon.

cuviper commented on July 1, 2024

I think you'll need to figure out what that System time is actually doing, because that looks pathological. Does Xcode profiling or anything like that reveal System details?


xiyu1984 commented on July 1, 2024

I found that the cause may be contention for CPU resources between rayon's work-stealing strategy and the system.

Again, note that everything worked well on versions of macOS before Sonoma 14.4.

When num_threads(...) is not set manually, the default num_threads is 14 (M3 Max, 14 + 16). In this case, the system will "rob" the CPU resources back, and the "robbing" itself is costly.

Then I limited the num_threads to 4 as follows:

rayon::ThreadPoolBuilder::new().num_threads(4).build_global().unwrap();

The system is still "robbing", but the user process gets to use these 4 threads most of the time. In my program's case, performance improves, although it is still much slower than before because the CPU cores are not fully used. This is only a temporary workaround.


xiyu1984 commented on July 1, 2024

I think the problem may not be entirely related to rayon.

From my experience so far, the number of threads needs to be kept below the number of cores. The details are as follows:

  • if I limit num_threads to 4, the parallel work runs stably.
  • if I limit num_threads to 8, the parallel work sometimes runs stably, but there's a chance of being "robbed".
  • if I limit num_threads to 12, the parallel work sometimes runs stably, but there's a higher chance of being "robbed".

Maybe the larger the num_threads, the higher the probability of being "robbed".
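One way to check whether this pattern depends on rayon at all is to time the same CPU-bound work with plain OS threads at the thread counts listed above. The following is only a sketch under assumptions: `timed_parallel_sum` and its toy workload are made up for illustration, and it uses `std::thread::scope` rather than rayon, so no work stealing is involved.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical benchmark helper: sums `x * x % 7` over `0..n`, split across
/// `threads` OS threads, and returns the total with the wall-clock time.
fn timed_parallel_sum(n: u64, threads: usize) -> (u64, Duration) {
    let threads = threads.max(1) as u64;
    // Ceiling division so every element lands in exactly one chunk.
    let chunk = (n + threads - 1) / threads;
    let start = Instant::now();
    let total = thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|i| {
                let lo = (i * chunk).min(n);
                let hi = ((i + 1) * chunk).min(n);
                s.spawn(move || (lo..hi).map(|x| x.wrapping_mul(x) % 7).sum::<u64>())
            })
            .collect();
        // Join every worker and add up the partial sums.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    });
    (total, start.elapsed())
}

fn main() {
    // Thread counts taken from the list above; the workload size is arbitrary.
    for threads in [4, 8, 12] {
        let (sum, elapsed) = timed_parallel_sum(50_000_000, threads);
        println!("{threads:>2} threads: sum = {sum}, elapsed = {elapsed:?}");
    }
}
```

If the timings get worse at 8 or 12 threads here too, that would point toward the operating system rather than rayon's scheduler, though a toy loop like this may not reproduce the memory behaviour of a real program.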

And this is how the resources were "robbed" by the system:
[image]

Maybe this is why a num_threads of 4 works.


xiyu1984 commented on July 1, 2024

> I think you'll need to figure out what that System time is actually doing, because that looks pathological. Does Xcode profiling or anything like that reveal System details?

I just checked the information in Activity Monitor and, as you said, what it shows looks pathological and contradictory.
The picture shows the system taking the CPU resources away, yet the per-process breakdown shows that my process uses the most CPU. But I'm sure my process is slowed down, so those CPU resources are not being spent on its computation.

Anyway, I will look into this problem more deeply soon, following your suggestion.


xiyu1984 commented on July 1, 2024

Things are clearer.

I did some deeper profiling and found that with higher parallelism, my process needs more memory, which triggers security checks in the kernel, and those checks are costly.

This might be confirmed by https://appleinsider.com/articles/24/03/21/apple-silicon-vulnerability-leaks-encryption-keys-and-cant-be-patched-easily

Luckily, rayon still works well.
