Git Product home page Git Product logo

Comments (11)

DrTimothyAldenDavis avatar DrTimothyAldenDavis commented on August 14, 2024

The algorithms I'm using for reduce-to-vector would break if the operator is not associative and commutative. FIRST and SECOND wouldn't work at all. Slower methods could work, but then I would have to have two sets of reduce-to-vector methods: one where the operator is well-behaved, and a slower one where it isn't. That's a lot of extra code to support.

from graphblas.

eriknw avatar eriknw commented on August 14, 2024

Thanks. I understand this is the case for generic binary operators. If the internal data formats are sorted, I would expect there to be efficient specialized implementations for FIRST and SECOND.

Would you be okay with creating special implementations for these, or do you think this would get too messy? If you're okay with a specialized approach, would this be a good "first issue" to implement for new contributors?

from graphblas.

eriknw avatar eriknw commented on August 14, 2024

fwiw, I can't think of a meaningful example when restricted to the current spec where a Monoid can't be used instead of a BinaryOp. So, I think it's fine that you don't fully support reduce with BinaryOp, and perfectly understandable if you don't want to add special code to support a couple edge cases (in this case, FIRST and SECOND).

A better, longer term approach may be to address some of the deficiencies of using BinaryOps (which there are several). In particular, it would be great to have generic machinery in place that can perform a reduction with FIRST or SECOND in parallel. I believe this is doable, but we'd need to determine an API.

For another example, suppose one created a user-define type that is a struct. With generic, parallel machinery in place, I would expect one could perform a reduction to e.g. add all the elements of one of the fields in the struct.

from graphblas.

DrTimothyAldenDavis avatar DrTimothyAldenDavis commented on August 14, 2024

I think the API would be different. There's a similar issue that arises in the CC problem, where an assignment C(I) max= A is done, using the max as the accumulator operator. But in this case, if the index vector has duplicates, they want to apply the accum operator to reduce down to a final result. This is not something the C API for GraphBLAS defines. Using FIRST and SECOND for the accum operator is not unlike the kind of reduction you want to do, if C is a vector and A a matrix. These two cases are different, but they are related in changing what it means to do a reduction.

from graphblas.

eriknw avatar eriknw commented on August 14, 2024

That's an interesting case with "assign" with duplicate indices. I see you discuss this (or something very similar) in section 9.9. If I understand correctly, I think what you want here is an explicit "scatter" operation (to a list of indices) that accepts a binaryop/monoid to perform the reduction when values are scattered to the same index. This reduction could happen before the accumulation into the output, so, per normal GraphBLAS operations, accum could be a different binop. Fwiw, I would love to have a functional scatter as well where a function is used to determine which index a value is scattered to.

For interested readers (e.g., my future self): "build" is another place where reduction is kind of special. Could one expect to get better performance if a Monoid were accepted instead of a BinaryOp? SuiteSparse does accept FIRST and SECOND here and is well-defined where the spec is undefined if the binop is not associative or commutative.

from graphblas.

DrTimothyAldenDavis avatar DrTimothyAldenDavis commented on August 14, 2024

from graphblas.

DrTimothyAldenDavis avatar DrTimothyAldenDavis commented on August 14, 2024

I think what you really need is not a reduce-to-vector with FIRST and SECOND. That is ill-defined for any other binary op. What you really need is a kind of GxB_select that can depend on the entry count in a given row or column. "Give me just the first 4 entries in each row, starting from the lowest column indices". That step appears in CC, but GraphBLAS can't do it so the LAGraph CC exports the matrix and does it itself.

from graphblas.

eriknw avatar eriknw commented on August 14, 2024

It's good to bring select into the conversation, but I really would like a better reduction with binaryop. For example, performing a reduction on one field of a user-defined type that is a struct.

To do such a generic reduction well (and able to be done in parallel, so I will be talking about doing reductions in groups), I think at least four things are needed:

  • Y = unaryop(x): initialize a reduction with the first argument in a group. The output type Y may be different from the input type X.
  • Y = binaryop(Y, X): perform a step in the reduction within a group. The first argument is the previous result of the reduction within the group, and the second argument is the current value in the matrix.
  • Y = combine(Y, Y): a binaryop or monoid to combine results from groups.
  • is_commutative: whether the reduction commutes or not. If it doesn't commutes, then the combine steps must occur in order. Otherwise, groups may be combined in any order. This could probably be inferred/implied by whether the combine operator is a binaryop or monoid.

For example:

group1 = binop(binop(binop(unary(x1), x2), x3), x4)
group2 = binop(binop(binop(unary(x5), x6), x7), x8)
final = combine(group1, group2)

A further enhancement (that I'm not pushing for, but think is important to be aware of) would be to allow the type Y to be malloced, which would first happen in the unaryop. If it were malloced, then one would also need a finalize unaryop to perform the free and convert to a normal (fixed-size) GraphBLAS type. For completeness-sake, if malloc were used, then one could also provide a function to serialize and deserialize the object to send over the wire if necessary. Although complicated, this functionality would also be extremely powerful.

If order is important (i.e., for reductions that don't commute), then there is a natural order to do things: along a column, along a row, or along a vector. For matrix reduce to scalar, we'd need to choose a convention to linearize the elements (for example, like it were C-contiguous or row-wise), and setting the transpose descriptor on the input would choose the other option for linearizing a matrix.

FIRST and SECOND make perfect sense in view of this style of reduction. For these, the unaryop would simply be the identity, and the combine operator would be the same as the reducing binaryop (FIRST or SECOND).

If any of this functionality were available, I would surely use it. I don't need it though, and I think having this conversation is more important atm.

from graphblas.

eriknw avatar eriknw commented on August 14, 2024

I like that your select variant has a clean API that is easy to understand. Instead of operating on the absolute index I or J of an element, it operates on, say, ith and jth, which is the count of number of elements previously seen along this row, column, or vector. It's perhaps a little awkward if both ith and jth are needed. Otherwise, the API can match your select or the new select. I don't have a feel for how common or useful such an operation would be.

Regarding reduction on user-defined binaryops again, there are many more operations that could be done with my proposal above when combined with user-defined types:

  • Get up to the first k elements.
    • it may be difficult to stop the iteration once k elements are seen.
  • Get a random k elements.
  • Determine if rows are monotonically increasing or decreasing.
  • Calculate approximate percentiles.
  • Calculate approximate count distinct.
  • Compute a checksum.
  • Compute the sum and count at the same time
    • If there were a finalize step at the end, this could then compute the mean.
  • Similarly, if there were a finalize, one could first compute the count, sum(x), and sum(x**2), and then calculate the standard deviation from these all in a single pass.
    • We can take this further and use it to compute a linear regression!
  • Lots more: power mean, geometric mean, central mean (two passes), log-sum-exp, skewness (two passes), etc.

It is common in modern data analytics system to have user-defined aggregation functions in a form similar to initialize-update-merge-evaluate. Variations exist, and I should point to literature or documentation...

from graphblas.

DrTimothyAldenDavis avatar DrTimothyAldenDavis commented on August 14, 2024

from graphblas.

eriknw avatar eriknw commented on August 14, 2024

Thanks for taking the time to reply. I think I can do a better job explaining what I consider most important in this issue, so please indulge me as I summarize below. This is a long comment, so please take as much time as you like.

Summary

  1. Reductions with BinaryOps (esp. user-defined ones) as currently in the spec have serious deficiencies.
  2. SuiteSparse:GraphBLAS does a reasonable thing and only supports reductions with Monoids or builtin BinaryOps that have associated Monoids.
  3. Even though I know (2), I was surprised that reducing a Matrix to a Vector with FIRST or SECOND didn't work (we can call this a user error--my bad).
  4. Stronger than (3), I believe reductions with user-defined BinaryOps can be extremely powerful and is likely to be useful for some graph algorithms (and certainly useful for a general sparse linear algebra library).
  5. Hence, I think it's worth the time to try to come up with a better approach to perform user-defined reductions not limited to Monoids.
  6. I proposed an approach that I think fits within GraphBLAS and that I think addresses many of the existing shortcomings.
  7. I would like feedback. Is this approach feasible? Implementable? Mathematically sound? Complete? Too much? Can it be improved? Should I spend the time to link to supporting references?
  8. Is there any interest in me creating new function signatures or PoC implementations of my proposal?
  9. I will be watching for examples where this may come in handy!

The Proposal

The goal:

  • support reductions with BinaryOps (including user-defined ones)
  • that can express many operations (more than can be expressed with matrix multiply with different semirings)
  • and can be parallelized efficiently.

Reduction methods include Matrix to Vector, Matrix to Scalar, Vector to Scalar, and building Matrices and Vectors with dup_op (this last one may require separate attention). Because order may matter during the reduction, I will call the group of elements that is being reduced over a list (if order didn't matter, I would call this a multiset). Since we require the reduction to be parallelizable, we allow the list of elements to be partitioned into sub-lists that can be operated on independently.

Instead of a single BinaryOp, we require three operators:

  • Initialize with UnaryOp: applied to the first item in a sub-list.
    • output domain may be anything (i.e., it is not limited to be coercible to or from the input domain)
  • Update with BinaryOp: operates on the previous result and the next item in the sub-list.
    • output domain should match the output domain of UnaryOp.
  • Merge with BinaryOp or Monoid: combines results from sub-lists.
    • If BinaryOp: sub-list results must be combined in order (i.e., doesn't commute), and updates with BinaryOp must operate in order.
      • Input and output domains should match.
    • If Monoid: sub-list results may be combined in any order, and updates with BinaryOp may operate in any order.
    • Associativity is assumed for BinaryOp and Monoid; e.g., tree reductions are allowed.
    • Instead of having variants for both options--BinaryOp and Monoid--another option would be to only support BinaryOp with a boolean flag whether the operation commutes or not (i.e., needs to operate in order or not).

If there are k elements in a list being reduced, the UnaryOp may be called between 1 to k times (i.e., once per sub-list). The expectation is that the list will be partitioned into enough sub-lists to take advantage of parallelism. The UnaryOp will probably be the identity fairly often, but it may also be a relatively expensive operation, so we don't want to always call it on every element.

It's worth noting that there is no notion of identity of the missing elements or of the BinaryOp. Such an identity is strictly for Monoids and doesn't apply here. If the list being reduced is empty, then the result is empty. If it only has one element, then the result is the result of the UnaryOp.

I realize this would be a lot of extra code to support. I'm not asking you to implement it (a JIT would be much more amazing!), but I do think it's worth having the conversation and keeping an eye open for use cases.

Misc.

Yeah, I understand the behavior and pitfalls of the variant of select you brought up. I'll let you know if I find any more use cases for it.

I wasn't asking for mode or median. I was saying that if I were able to perform reductions with user-defined BinaryOps as per my proposal, then I could use it to efficiently compute approximate percentiles. And lots of other things in a single reduction operation.

Going back a few comments, you described a situation that comes up in CC. If I understand correctly, I recognize this operation as a "scatter" with aggregation. I would love it if an astute reader of this issue would give this more thought!

I'm sorry for repeating myself. There's been a lot of good/interesting information in this thread, and I wanted to refocus :)

from graphblas.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.