Git Product home page Git Product logo

Comments (5)

cuviper avatar cuviper commented on July 18, 2024

I don't there can be a standard suggestion, because it depends on both the number of elements and the number of threads in the pool, at least. In theory it will have logarithmic depth, but work-stealing will lengthen that further.

I wonder if we need to be smarter about MAX_SEQUENTIAL here:

// If both partitions are up to this length, we continue sequentially. This number is as small
// as possible but so that the overhead of Rayon's task scheduling is still negligible.
const MAX_SEQUENTIAL: usize = 2000;

i.e. rather than just a fixed constant, maybe that should also consider num_elem / num_threads and some additional factor to account for imbalance. Then it should use a higher limit when you're starting with billions, and you won't get so much stack use due to work stealing.


Similarly, the stable sort has two constants, CHUNK_SIZE while splitting and MAX_SEQUENTIAL while merging:

// The length of initial chunks. This number is as small as possible but so that the overhead
// of Rayon's task scheduling is still negligible.
const CHUNK_LENGTH: usize = 2000;

// Slices whose lengths sum up to this value are merged sequentially. This number is slightly
// larger than `CHUNK_LENGTH`, and the reason is that merging is faster than merge sorting, so
// merging needs a bit coarser granularity in order to hide the overhead of Rayon's task
// scheduling.
const MAX_SEQUENTIAL: usize = 5000;

from rayon.

vigna avatar vigna commented on July 18, 2024

Well, what if I know the number of elements, their size, and the number of threads in the pool? Just ballpark.

from rayon.

cuviper avatar cuviper commented on July 18, 2024

I don't know. You'll get a better answer if you run a few experiments yourself.

from rayon.

norabelrose avatar norabelrose commented on July 18, 2024

I'm running into the same issue and have found that around 5-10B elements the default of 2MB seems to break, so I'm testing out a heuristic of using max[1, log2(N / 5e9)] * 2 MB right now.
(this is sorting a memory mapped slice of u64 on a machine with 48 physical cores)

from rayon.

norabelrose avatar norabelrose commented on July 18, 2024

Update: I actually ran into a bus error (core dumped) when using the logarithmic scaling. Linear scaling would sort of defeat the purpose of memory mapping the data, so I might try square root scaling.

from rayon.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.