
Comments (7)

jpivarski commented on May 23, 2024

I put this fix in a branch where I'm still developing the deep reducers, but that ought to be finished tomorrow. (They seem to be working already; I just need to generalize it to all awkward types, which involves introducing a concept of "deepest jagged" that I'll use to solve #56, so two-in-one.)

Check out how minimal the patch is. It's a typo and an isolated incident: there shouldn't be any references to self in this block of code; it's where we're chaining operations onto the output, where everything should happen on node. It was testing code that snuck into the implementation.

Here's your output with the fix:

>>> import awkward
>>> test = awkward.JaggedArray.fromcounts([1, 2, 1], [1, 2, 3, 4])
>>> test
<JaggedArray [[1] [2 3] [4]] at 0x7f69164eef28>
>>> test[1, 0]
2
>>> test[[1], [0]]
array([[2]])
>>> test[[1, 0], [0, 0]]
array([[2, 2],
       [1, 1]])

from awkward-0.x.

lgray commented on May 23, 2024

Hi Jim, shouldn't the output be an array of dimension one?


jpivarski commented on May 23, 2024

Yikes! What is Numpy doing? I thought I understood all of its behaviors, but it's stepping out of line with the normal behavior, in which array[sel1, sel2] means "sel2 is a next-dimensional cross-section of what sel1 picked out."

For example,

>>> temp = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> temp[:2, [0, 1]]
array([[1, 2],
       [3, 4]])

:2 picks the first two outer lists, [1, 2] and [3, 4], and then [0, 1] picks the zeroth and first elements from those. Thus,

>>> temp = awkward.JaggedArray.fromiter([[1, 2], [3, 4], [5, 6]])
>>> temp[:2, [0, 1]]
array([[1, 2],
       [3, 4]])
>>> temp[[0, 1], [0, 1]]
array([[1, 2],
       [3, 4]])

awkward-array does the same thing, regardless of whether the first selector is :2 or [0, 1].

But then Numpy does this???

>>> temp = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> temp[[0, 1], [0, 1]]
array([1, 4])

I thought I had seen everything, but I guess not. I suppose the logic of this is that the tuple represents a pair of coordinate arrays...

>>> temp[[0, 1, 2], [0, 1]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,) 

Apparently. As far as I can tell, this is breaking the pattern established with the temp[:2, [0, 1]] example, but I see the use-case: you want all the elements that satisfy (i=0 and j=0) or (i=1 and j=1) (two elements), whereas awkward-array gives all the elements that satisfy (i=0 or i=1) and (j=0 or j=1) (four elements). But this happens only if everything in the tuple is a sequence of integers, not if some are slices.
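To put the two readings side by side, here is a small NumPy-only sketch. It uses numpy.ix_, which is not mentioned above, as one way to ask NumPy for the cross-section behavior explicitly; the variable names paired, outer, and mixed are just for illustration.

```python
import numpy as np

temp = np.array([[1, 2], [3, 4], [5, 6]])

# Two integer sequences are treated as paired coordinates:
# (i=0, j=0) and (i=1, j=1).
paired = temp[[0, 1], [0, 1]]

# np.ix_ builds an open mesh, so every i is combined with every j.
outer = temp[np.ix_([0, 1], [0, 1])]

# A slice mixed with an integer sequence already gives the cross-section.
mixed = temp[:2, [0, 1]]

print(paired.tolist())  # [1, 4]
print(outer.tolist())   # [[1, 2], [3, 4]]
print(mixed.tolist())   # [[1, 2], [3, 4]]
```

So the cross-section reading is still available in NumPy, but it has to be requested explicitly once every selector is an integer sequence.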

It would be hard to retrofit awkward-array to implement this exception to the rule. It's a recursive implementation in which each part only knows its own level of depth, then passes on the baton. Not necessarily impossible, but hard to think about.

Also, I think this Numpy behavior might only work for regular arrays, not jagged arrays (in general). I'll have to think about that. If so, then this could justify the different behavior. (Another example: the axis parameter is incompatible with jagged arrays.)


jpivarski commented on May 23, 2024

Thinking about it while walking, yes, it can be retrofitted. It's just a special case exception to the usual behavior, and will require more special case code.

I'll need to pass down some kind of a marker to indicate that we're in this case.
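As a rough illustration of the "marker" idea, here is a toy sketch over nested Python lists. It is not the awkward-array implementation: select, select_paired, and pair_index are hypothetical names, and unlike NumPy it requires all index sequences to have the same length rather than broadcasting them.

```python
def select(node, selectors, pair_index=None):
    """Toy recursive indexer over nested Python lists (hypothetical helper)."""
    if not selectors:
        return node
    head, tail = selectors[0], selectors[1:]
    if pair_index is not None:
        # Marker set: this level takes only the index paired with the
        # position already chosen at the outermost level.
        return select(node[head[pair_index]], tail, pair_index)
    # Usual behavior: each selector is a cross-section of what came before.
    return [select(node[i], tail) for i in head]


def select_paired(node, selectors):
    """Entry point for the all-integer-sequences case (hypothetical helper)."""
    if len({len(s) for s in selectors}) != 1:
        raise IndexError("index sequences must have the same length")
    head, tail = selectors[0], selectors[1:]
    # Pass the position k down as the marker so deeper levels stay in step.
    return [select(node[head[k]], tail, pair_index=k) for k in range(len(head))]


jagged = [[1], [2, 3], [4]]
outer = select(jagged, ([1, 0], [0, 0]))
paired = select_paired(jagged, ([1, 0], [0, 0]))

print(outer)   # [[2, 2], [1, 1]]
print(paired)  # [2, 1]
```

Each level still only knows its own depth; the only extra state handed down is the position chosen at the top.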


jpivarski commented on May 23, 2024

You know, if you have a jagged array that happens to be regular (all dimensions have constant size), you can convert it to a regular numpy array with array.regular().
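To show what "happens to be regular" means in terms of the underlying buffers, here is a plain-NumPy sketch. It does not call awkward at all: to_regular is a hypothetical helper, and the offsets/content pair is assumed to be the usual flat representation of a jagged array.

```python
import numpy as np


def to_regular(offsets, content):
    """Reshape flat content into 2-D if every row has the same length.

    Hypothetical helper for illustration; not the library's regular().
    """
    counts = np.diff(offsets)
    if len(counts) == 0 or not np.all(counts == counts[0]):
        raise ValueError("rows have different lengths; array is not regular")
    flat = content[offsets[0]:offsets[-1]]
    return flat.reshape(len(counts), int(counts[0]))


# [[1, 2], [3, 4], [5, 6]] stored as offsets plus flat content
offsets = np.array([0, 2, 4, 6])
content = np.array([1, 2, 3, 4, 5, 6])
regular = to_regular(offsets, content)

print(regular.tolist())  # [[1, 2], [3, 4], [5, 6]]
```

With offsets such as [0, 1, 3, 4] (rows of length 1, 2, 1) the helper raises ValueError, which is the case where no regular NumPy array exists.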


lgray commented on May 23, 2024

Unfortunately I am not so lucky :-(

Fortunately I can work around this by instead making a list that is valargs = jagged.offsets[dim1args] + dim2args.

Where dim{1,2}args are the indices resulting from a search.

However, I think this is not the interface you would like to present to users!
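For concreteness, the offsets workaround can be sketched with plain NumPy arrays standing in for the jagged array's offsets and flat content (the example values are made up, and no bounds check is done on dim2args against each row's length):

```python
import numpy as np

# [[1], [2, 3], [4]] stored as offsets plus flat content
offsets = np.array([0, 1, 3, 4])
content = np.array([1, 2, 3, 4])

# Paired (row, column) indices, e.g. the result of a search
dim1args = np.array([1, 0])
dim2args = np.array([0, 0])

# Start of each selected row, shifted by the position within that row
valargs = offsets[dim1args] + dim2args
picked = content[valargs]

print(valargs.tolist())  # [1, 0]
print(picked.tolist())   # [2, 1]
```

This gives the one-dimensional, paired result that NumPy would give for a regular array.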


jpivarski commented on May 23, 2024

We now have tests like the following to test many combinations of extraction, slicing, masking, and gathering (fancy indexing).

np = numpy.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
aw = awkward.fromiter(np)

assert np.tolist() == aw.tolist()
assert np[:2].tolist() == aw[:2].tolist()
assert np[:2, :2].tolist() == aw[:2, :2].tolist()
assert np[:2, 2].tolist() == aw[:2, 2].tolist()
assert np[2, :2].tolist() == aw[2, :2].tolist()
assert np[:2, [0, 1]].tolist() == aw[:2, [0, 1]].tolist()
assert np[[0, 1], :2].tolist() == aw[[0, 1], :2].tolist()
assert np[:2, [0, 1, 2]].tolist() == aw[:2, [0, 1, 2]].tolist()
assert np[[0, 1, 2], :2].tolist() == aw[[0, 1, 2], :2].tolist()
assert np[[0, 1], [0, 1]].tolist() == aw[[0, 1], [0, 1]].tolist()
assert np[[0, 1, 2], [0, 1, 2]].tolist() == aw[[0, 1, 2], [0, 1, 2]].tolist()
assert np[:2, [True, False, True]].tolist() == aw[:2, [True, False, True]].tolist()
assert np[[True, False, True], :2].tolist() == aw[[True, False, True], :2].tolist()
assert np[[True, False, True], [True, False, True]].tolist() == aw[[True, False, True], [True, False, True]].tolist()

For this to work, JaggedArray.__getitem__ is 250 lines long! When I thought the rules we were targeting were simpler, I was more optimistic about implementing them all in complete generality in Numpy only. Now I'm beginning to think that the future compiled version of this wouldn't be just for a speedup, but also to get better parity with the original; maybe organize the code to mirror Numpy's.

This is still in the "feature-particle-matching-example" branch.

