Git Product home page Git Product logo

Comments (8)

nikomatsakis avatar nikomatsakis commented on July 16, 2024

I've been wanting for some time to do something standard like that, but also allow you to opt-in to a non-deterministic result in exchange for speed. The idea would be to use an AtomicBool which is set to true when an error occurs -- then, before we process each item, we check if an error has occurred and bail out if so.

from rayon.

cuviper avatar cuviper commented on July 16, 2024

See also the discussion in #51. This sort of early-exit can probably share common code.

from rayon.

cuviper avatar cuviper commented on July 16, 2024

BTW, if your types are already Result<(), E>, then that final .and(Ok(())) is redundant. That was just a quick way to map any arbitrary Ok(T) into nothing.

from rayon.

emk avatar emk commented on July 16, 2024

I ran a bit with the idea of converting existing for ... in and map(...).collect() iterations to parallel iterations using rayon, and I came up with two badly-named methods: reduce_automatically, which automatically reduces any type supporting an identity and an associative operator, and reduce_results_to_vec, which takes a parallel iterator over Result<T> and returns Result<Vec<T>> in some arbitrary order.

I have no idea how generally useful these might be, but I submit them as data points to your design process. Using these two reduction functions, I think I trivially convert most of the loops and iterations in cage to use rayon.

Thank you for a great library!

//! Extensions for `rayon`'s parallel iterators.

use rayon::prelude::*;

/// Wikipedia says: "In abstract algebra, a branch of mathematics, a monoid
/// is an algebraic structure with a single associative binary operation
/// and an identity element."
///
/// This trait represents values that have sensible "auto-reduce" behavior
/// when returned from a parallel computation.  We allow associative but
/// non-commutative reduce operators because it's often convenient to
/// collect results in a `Vec`, even if the order is arbitrary.
trait MonoidReduce {
    /// The identity element: This is what you want to return if you have a
    /// parallel computation with no inputs, just for example.
    fn default() -> Self;

    /// Given two values, reduce them to a single value.
    fn reduce(v1: Self, v2: Self) -> Self;
}

/// Reduce `()` to itself.
impl MonoidReduce for () {
    fn default() -> Self {
        ()
    }

    fn reduce(_: Self, _: Self) -> Self {
        ()
    }
}

/// Reduce `Vec` by combining two vectors into one.
impl<T> MonoidReduce for Vec<T> {
    fn default() -> Self {
        vec![]
    }

    fn reduce(mut v1: Self, mut v2: Self) -> Self {
        v1.append(&mut v2);
        v1
    }
}

/// Reduce `Result` by propagating failures.  If all results are
/// successful, do pair-wise reductions of the `Ok` values.
///
/// When https://github.com/nikomatsakis/rayon/issues/113 is fixed, there
/// should be a built-in API for reducing `Result` that aborts the
/// computation early.
impl<T: MonoidReduce, E> MonoidReduce for Result<T, E> {
    fn default() -> Self {
        Ok(T::default())
    }

    fn reduce(v1: Self, v2: Self) -> Self {
        match (v1, v2) {
            (Ok(v1), Ok(v2)) => Ok(T::reduce(v1, v2)),
            (err @ Err(_), _) |
            (_, err @ Err(_)) => err,
        }
    }
}

/// Extra methods that we can call on a `ParallelIterator` over a type
/// support `MonoidReduce`.
trait ParallelIteratorReduceAutomatically<T>: ParallelIterator<Item = T>
    where T: MonoidReduce + Send
{
    /// Return `Ok(())` if no result is an `Err`, or an
    /// arbitrarily-selected error if there are one or more error results.
    /// This is appropriate when we don't want to spam the user with
    /// multiple errors from the same run.
    fn reduce_automatically(self) -> T;
}

impl<T, I> ParallelIteratorReduceAutomatically<T> for I
    where T: MonoidReduce + Send,
          I: ParallelIterator<Item = T>
{
    fn reduce_automatically(self) -> T {
        self.reduce_with(T::reduce)
            .unwrap_or_else(T::default)
    }
}

#[test]
#[cfg_attr(feature="clippy", allow(let_unit_value))]
fn reduce_automatically_uses_monoid_reduce() {
    let _: () = vec![(), ()].par_iter().cloned().reduce_automatically();
    assert_eq!(vec![vec!(1), vec!(1)].par_iter().cloned().reduce_automatically(),
               vec![1, 1]);
    let ok: Result<(), &'static str> = Ok(());
    assert_eq!(vec![ok, ok].par_iter().cloned().reduce_automatically(),
               Ok(()));
    assert_eq!(vec![Err("!"), Ok(())].par_iter().cloned().reduce_automatically(),
               Err("!"));
}

/// Extra methods that we can call on a `ParallelIterator`.
trait ParallelIteratorReduceToResultsToVec<T, E>
    : ParallelIterator<Item = Result<T, E>>
    where T: Send,
          E: Send
{
    /// Collect the results of a parallel iteration as a `Vec` in an
    /// arbitrary order if all results succeed.
    fn reduce_results_to_vec(self) -> Result<Vec<T>, E>;
}

impl<T, E, I> ParallelIteratorReduceToResultsToVec<T, E> for I
    where T: Send,
          E: Send,
          I: ParallelIterator<Item = Result<T, E>>
{
    fn reduce_results_to_vec(self) -> Result<Vec<T>, E> {
        self.map(|result| result.map(|value| vec![value]))
            .reduce_automatically()
    }
}

#[test]
fn reduce_results_to_vec_returns_ok_on_success() {
    let result: Result<Vec<u8>, &'static str> = vec![1,1,1]
        .par_iter()
        .map(|v| Ok(v.to_owned()))
        .reduce_results_to_vec();
    assert_eq!(result.unwrap(), vec![1, 1, 1]);
}

#[test]
fn reduce_results_to_vec_returns_err_on_any_failure() {
    let result: Result<Vec<u8>, &'static str> = vec![1,1,1]
        .par_iter()
        .map(|_| Err("!"))
        .reduce_results_to_vec();
    assert!(result.is_err());
}

Note that if you use a wrapper type implementing MonoidReduce, you can reduce just about anything easily. And reduce_to_vec could be easily generalized to many more collection types with an extra trait.

from rayon.

nikomatsakis avatar nikomatsakis commented on July 16, 2024

Hmm so I guess we can't quite use find_any() for this; it's more like a collect() wrapper. But we're close. =)

from rayon.

cuviper avatar cuviper commented on July 16, 2024

The standard library has FromIterator for Option<C> and Result<C, E>, where C: FromIterator too, and these short-circuit on the first None or Err seen. Since #315 we've had the equivalent behavior with FromParallelIterator.

It's not a fully-general monoid reduction, but does this meet your original need?

Aside, it might be useful to allow collecting () from iterators of Item=(), both in std and rayon. This would essentially act like for_each, except you could combine it with the above Option and Result behavior too. I think I'll try it in std first...

from rayon.

emk avatar emk commented on July 16, 2024

It's not a fully-general monoid reduction, but does this meet your original need?

Yes, my specific use case is short-circuiting on errors, and just returning one. I don't need fully-general monoid reductions.

(I do admit a slight fondness for those basic abstract algebraic traits like "monoid" or "group". They're well-established mathematical abstractions, and they cover hundreds of use-cases precisely. But of course, in Rust, you'd probably want to rename them to something a bit friendlier, anyway. And here, we really prefer the short-circuiting behavior, anyway.)

from rayon.

cuviper avatar cuviper commented on July 16, 2024

OK, thanks!

from rayon.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.