Making an issue for posterity after showing this to Jim in the hallway. This is in

Tuples of lists of indices does not work as in numpy about awkward-0.x HOT 7 CLOSED

lgray commented on May 23, 2024

Tuples of lists of indices does not work as in numpy

from awkward-0.x.

Comments (7)

jpivarski commented on May 23, 2024

I put this fix in a branch where I'm still developing the deep reducers, but that ought to be finished tomorrow. (They seem to be working already; I just need to generalize them to all awkward types, which involves introducing a concept of "deepest jagged" that I'll use to solve #56, so two-in-one.)

Check out how minimal the patch is. It's a typo and an isolated incident: there shouldn't be any references to self in this block of code; it's where we're chaining operations onto the output, where everything should happen on node. It was testing code that snuck into the implementation.

Here's your output with the fix:

>>> import awkward
>>> test = awkward.JaggedArray.fromcounts([1, 2, 1], [1, 2, 3, 4])
>>> test
<JaggedArray [[1] [2 3] [4]] at 0x7f69164eef28>
>>> test[1, 0]
2
>>> test[[1], [0]]
array([[2]])
>>> test[[1, 0], [0, 0]]
array([[2, 2],
       [1, 1]])


lgray commented on May 23, 2024

Hi Jim, shouldn't the output be an array of dimension one?


jpivarski commented on May 23, 2024

Yikes! What is Numpy doing? I thought I understood all of its behaviors, but it's stepping out of line with the normal behavior, in which array[sel1, sel2] means "sel2 is a next-dimensional cross-section of what sel1 picked out."

For example,

>>> temp = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> temp[:2, [0, 1]]
array([[1, 2],
       [3, 4]])

:2 picks the first two outer lists, [1, 2] and [3, 4], and then [0, 1] picks the zeroth and first elements from those. Thus,

>>> temp = awkward.JaggedArray.fromiter([[1, 2], [3, 4], [5, 6]])
>>> temp[:2, [0, 1]]
array([[1, 2],
       [3, 4]])
>>> temp[[0, 1], [0, 1]]
array([[1, 2],
       [3, 4]])

awkward-array does the same thing, regardless of whether the first selector is :2 or [0, 1].

But then Numpy does this???

>>> temp = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> temp[[0, 1], [0, 1]]
array([1, 4])

I thought I had seen everything, but I guess not. I suppose the logic of this is that the tuple represents a pair of coordinate arrays...

>>> temp[[0, 1, 2], [0, 1]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)

Apparently. As far as I can tell, this is breaking the pattern established with the temp[:2, [0, 1]] example, but I see the use-case: you want all the elements that satisfy (i=0 and j=0) or (i=1 and j=1) (two elements), whereas awkward-array gives all the elements that satisfy (i=0 or i=1) and (j=0 or j=1) (four elements). But only if everything in the tuple is a sequence of integers, not if some are slices.
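To make the two readings concrete, here is a minimal pure-Python sketch on nested lists. The helper names pairwise_pick and outer_pick are hypothetical, not part of Numpy or awkward-array, and the pairwise version ignores Numpy's broadcasting of length-1 index lists. In Numpy itself, the cross-section result can be requested explicitly with temp[numpy.ix_([0, 1], [0, 1])].

```python
def pairwise_pick(rows, i_idx, j_idx):
    """Coordinate-array reading: element k is rows[i_idx[k]][j_idx[k]]."""
    if len(i_idx) != len(j_idx):
        # Simplification: real Numpy would first try to broadcast the shapes.
        raise IndexError("index lists must have the same length")
    return [rows[i][j] for i, j in zip(i_idx, j_idx)]


def outer_pick(rows, i_idx, j_idx):
    """Cross-section reading: j_idx is applied to every row picked by i_idx."""
    return [[rows[i][j] for j in j_idx] for i in i_idx]


temp = [[1, 2], [3, 4], [5, 6]]
print(pairwise_pick(temp, [0, 1], [0, 1]))  # [1, 4]
print(outer_pick(temp, [0, 1], [0, 1]))     # [[1, 2], [3, 4]]
```

The pairwise form needs only rows[i][j] to exist for each listed pair, so it also applies to rows of different lengths.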

It would be hard to retrofit awkward-array to implement this exception to the rule. It's a recursive implementation in which each part only knows its own level of depth, then passes on the baton. Not necessarily impossible, but hard to think about.

Also, I think this Numpy behavior might only work for regular arrays, not jagged arrays (in general). I'll have to think about that. If so, then this could justify the different behavior. (Another example: the axis parameter is incompatible with jagged arrays.)


jpivarski commented on May 23, 2024

Thinking about it while walking, yes, it can be retrofitted. It's just a special case exception to the usual behavior, and will require more special case code.

I'll need to pass down some kind of a marker to indicate that we're in this case.
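One way such a marker could work is sketched below on plain nested Python lists. This is a hypothetical illustration, not the awkward-array implementation: the function select and its pair argument are invented for this example, and only slices and lists of integers are handled. The first integer list fans out as usual but tags each branch with its position; any later integer list sees the tag and takes only the matching entry instead of a full cross-section.

```python
def select(node, selectors, pair=None):
    """Recursive indexing; `pair` is the marker handed down the levels.

    pair is None until an integer list has been applied; after that it is
    (position, length) for the branch currently being followed.
    """
    if not selectors:
        return node
    head, rest = selectors[0], selectors[1:]
    if isinstance(head, slice):
        # Slices always act as cross-sections and pass the marker through.
        return [select(child, rest, pair) for child in node[head]]
    if pair is None:
        # First integer list: fan out and remember each branch's position.
        return [select(node[i], rest, (k, len(head)))
                for k, i in enumerate(head)]
    k, length = pair
    if len(head) != length:
        raise IndexError("index lists of lengths %d and %d cannot be paired"
                         % (length, len(head)))
    # Later integer list: take only the entry paired with this branch.
    return select(node[head[k]], rest, pair)


temp = [[1, 2], [3, 4], [5, 6]]
print(select(temp, (slice(None, 2), [0, 1])))  # [[1, 2], [3, 4]]
print(select(temp, ([0, 1], [0, 1])))          # [1, 4]
print(select([[1], [2, 3], [4]], ([1, 0], [0, 0])))  # [2, 1]
```

Each level still only looks at its own selector; the only extra state is the small tuple passed along with the baton.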


jpivarski commented on May 23, 2024

You know, if you have a jagged array that happens to be regular (all dimensions have constant size), you can convert it to a regular numpy array with array.regular().


lgray commented on May 23, 2024

Unfortunately I am not so lucky :-(

Fortunately I can work around this by instead making a list that is valargs = jagged.offsets[dim1args] + dim2args.

Where dim{1,2}args are the indices resulting from a search.
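A minimal sketch of this workaround in pure Python, using the counts and content from the fromcounts example earlier in the thread. The helper flat_indices and the bounds check are additions for illustration; the original one-liner is a vectorized sum over Numpy arrays, and the result is presumably used to index the flat content.

```python
from itertools import accumulate

counts = [1, 2, 1]           # lengths of the inner lists [[1], [2, 3], [4]]
content = [1, 2, 3, 4]       # flat storage behind the jagged array
offsets = [0] + list(accumulate(counts))  # [0, 1, 3, 4]


def flat_indices(offsets, dim1args, dim2args):
    """Turn (outer, inner) index pairs into positions in the flat content."""
    out = []
    for i, j in zip(dim1args, dim2args):
        # Added check: the plain sum would silently spill into the next list.
        if not 0 <= j < offsets[i + 1] - offsets[i]:
            raise IndexError("inner index %d out of range for list %d" % (j, i))
        out.append(offsets[i] + j)
    return out


valargs = flat_indices(offsets, [1, 0, 1], [0, 0, 1])
values = [content[k] for k in valargs]
print(valargs)  # [1, 0, 2]
print(values)   # [2, 1, 3]
```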

However, I think this is not the interface you would like to present to users!


jpivarski commented on May 23, 2024

We now have tests like the following to test many combinations of extraction, slicing, masking, and gathering (fancy indexing).

np = numpy.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
aw = awkward.fromiter(np)

assert np.tolist() == aw.tolist()
assert np[:2].tolist() == aw[:2].tolist()
assert np[:2, :2].tolist() == aw[:2, :2].tolist()
assert np[:2, 2].tolist() == aw[:2, 2].tolist()
assert np[2, :2].tolist() == aw[2, :2].tolist()
assert np[:2, [0, 1]].tolist() == aw[:2, [0, 1]].tolist()
assert np[[0, 1], :2].tolist() == aw[[0, 1], :2].tolist()
assert np[:2, [0, 1, 2]].tolist() == aw[:2, [0, 1, 2]].tolist()
assert np[[0, 1, 2], :2].tolist() == aw[[0, 1, 2], :2].tolist()
assert np[[0, 1], [0, 1]].tolist() == aw[[0, 1], [0, 1]].tolist()
assert np[[0, 1, 2], [0, 1, 2]].tolist() == aw[[0, 1, 2], [0, 1, 2]].tolist()
assert np[:2, [True, False, True]].tolist() == aw[:2, [True, False, True]].tolist()
assert np[[True, False, True], :2].tolist() == aw[[True, False, True], :2].tolist()
assert np[[True, False, True], [True, False, True]].tolist() == aw[[True, False, True], [True, False, True]].tolist()

For this to work, JaggedArray.__getitem__ is 250 lines long! When I thought the rules we were targeting were simpler, I was more optimistic about implementing them all in complete generality in Numpy only. Now I'm beginning to think that the future compiled version of this wouldn't be just for a speedup, but also to get better parity with the original; maybe organize the code to mirror Numpy's.

This is still in the "feature-particle-matching-example" branch.
