Comments (7)
I put this fix in a branch where I'm still developing the deep reducers, but that ought to be finished tomorrow. (They seem to be working already; I just need to generalize it to all awkward types, which involves introducing a concept of "deepest jagged" that I'll use to solve #56, so two-in-one.)
Check out how minimal the patch is. It's a typo and an isolated incident: there shouldn't be any references to self in this block of code; it's where we're chaining operations onto the output, where everything should happen on node. It was testing code that snuck into the implementation.
Here's your output with the fix:
>>> import awkward
>>> test = awkward.JaggedArray.fromcounts([1, 2, 1], [1, 2, 3, 4])
>>> test
<JaggedArray [[1] [2 3] [4]] at 0x7f69164eef28>
>>> test[1, 0]
2
>>> test[[1], [0]]
array([[2]])
>>> test[[1, 0], [0, 0]]
array([[2, 2],
       [1, 1]])
from awkward-0.x.
Hi Jim, shouldn't the output be an array of dimension one?
from awkward-0.x.
Yikes! What is Numpy doing? I thought I understood all of its behaviors, but it's stepping out of line with the normal behavior, in which array[sel1, sel2] means "sel2 is a next-dimensional cross-section of what sel1 picked out."
For example,
>>> temp = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> temp[:2, [0, 1]]
array([[1, 2],
       [3, 4]])
:2 picks the first two outer lists, [1, 2] and [3, 4], and then [0, 1] picks the zeroth and first elements from those. Thus,
>>> temp = awkward.JaggedArray.fromiter([[1, 2], [3, 4], [5, 6]])
>>> temp[:2, [0, 1]]
array([[1, 2],
       [3, 4]])
>>> temp[[0, 1], [0, 1]]
array([[1, 2],
       [3, 4]])
awkward-array does the same thing, regardless of whether the first selector is :2 or [0, 1].
But then Numpy does this???
>>> temp = numpy.array([[1, 2], [3, 4], [5, 6]])
>>> temp[[0, 1], [0, 1]]
array([1, 4])
I thought I had seen everything, but I guess not. I suppose the logic of this is that the tuple represents a pair of coordinate arrays...
>>> temp[[0, 1, 2], [0, 1]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)
Apparently. As far as I can tell, this is breaking the pattern established with the temp[:2, [0, 1]] example, but I see the use-case: you want all the elements that satisfy (i=0 and j=0) or (i=1 and j=1) (two elements), whereas awkward-array gives all the elements that satisfy (i=0 or i=1) and (j=0 or j=1) (four elements). But Numpy only does this if everything in the tuple is a sequence of integers, not if some are slices.
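To make the two readings concrete, here is a minimal NumPy-only sketch. The names rows, cols, pairwise, outer, and chained are only for illustration. numpy.ix_ is the standard NumPy way to ask for the cross-section (outer) reading when both selectors are integer sequences.

```python
import numpy

temp = numpy.array([[1, 2], [3, 4], [5, 6]])
rows = [0, 1]
cols = [0, 1]

# Two integer sequences are broadcast together and read as (i, j) coordinate pairs.
pairwise = temp[rows, cols]

# numpy.ix_ builds an open mesh, so every selected row meets every selected column.
outer = temp[numpy.ix_(rows, cols)]

# Selecting one axis at a time gives the same cross-section.
chained = temp[rows][:, cols]

print(pairwise.tolist())  # [1, 4]
print(outer.tolist())     # [[1, 2], [3, 4]]
print(chained.tolist())   # [[1, 2], [3, 4]]
```

The outer and chained results match what temp[:2, [0, 1]] returns, which is the pattern the cross-section rule follows.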
It would be hard to retrofit awkward-array to implement this exception to the rule. It's a recursive implementation in which each part only knows its own level of depth, then passes on the baton. Not necessarily impossible, but hard to think about.
Also, I think this Numpy behavior might only work for regular arrays, not jagged arrays (in general). I'll have to think about that. If so, then this could justify the different behavior. (Another example: the axis parameter is incompatible with jagged arrays.)
from awkward-0.x.
Thinking about it while walking, yes, it can be retrofitted. It's just a special case exception to the usual behavior, and will require more special case code.
I'll need to pass down some kind of a marker to indicate that we're in this case.
from awkward-0.x.
You know, if you have a jagged array that happens to be regular (all dimensions have constant size), you can convert it to a regular numpy array with array.regular().
from awkward-0.x.
Unfortunately I am not so lucky :-(
Fortunately, I can work around this by building a flat index into the content instead: valargs = jagged.offsets[dim1args] + dim2args, where dim1args and dim2args are the indices resulting from a search.
However, I think this is not the interface you would like to present to users!
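The workaround can be sketched without awkward at all, assuming the jagged array is stored as a flat content array plus offsets marking where each sublist starts. The offsets and content arrays below are a hand-built model of [[1], [2, 3], [4]], and the dim1args/dim2args values are hypothetical search results, not taken from real data.

```python
import numpy

# Plain-NumPy model of the jagged array [[1], [2, 3], [4]]:
# sublist i occupies content[offsets[i]:offsets[i + 1]].
offsets = numpy.array([0, 1, 3, 4])
content = numpy.array([1, 2, 3, 4])

# Hypothetical search results: (sublist index, position within sublist) pairs.
dim1args = numpy.array([1, 1, 2])
dim2args = numpy.array([0, 1, 0])

# Guard against positions that run past the end of their sublist,
# which would otherwise silently read a neighbouring sublist.
counts = offsets[1:] - offsets[:-1]
assert (dim2args < counts[dim1args]).all()

# Start of each chosen sublist plus the position inside it.
valargs = offsets[dim1args] + dim2args
picked = content[valargs]

print(picked.tolist())  # [2, 3, 4]
```

This gives the pairwise (coordinate) result as a flat array, at the cost of exposing the offsets layout to the caller.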
from awkward-0.x.
We now have tests like the following to test many combinations of extraction, slicing, masking, and gathering (fancy indexing).
np = numpy.array([[1, 10, 100], [2, 20, 200], [3, 30, 300]])
aw = awkward.fromiter(np)
assert np.tolist() == aw.tolist()
assert np[:2].tolist() == aw[:2].tolist()
assert np[:2, :2].tolist() == aw[:2, :2].tolist()
assert np[:2, 2].tolist() == aw[:2, 2].tolist()
assert np[2, :2].tolist() == aw[2, :2].tolist()
assert np[:2, [0, 1]].tolist() == aw[:2, [0, 1]].tolist()
assert np[[0, 1], :2].tolist() == aw[[0, 1], :2].tolist()
assert np[:2, [0, 1, 2]].tolist() == aw[:2, [0, 1, 2]].tolist()
assert np[[0, 1, 2], :2].tolist() == aw[[0, 1, 2], :2].tolist()
assert np[[0, 1], [0, 1]].tolist() == aw[[0, 1], [0, 1]].tolist()
assert np[[0, 1, 2], [0, 1, 2]].tolist() == aw[[0, 1, 2], [0, 1, 2]].tolist()
assert np[:2, [True, False, True]].tolist() == aw[:2, [True, False, True]].tolist()
assert np[[True, False, True], :2].tolist() == aw[[True, False, True], :2].tolist()
assert np[[True, False, True], [True, False, True]].tolist() == aw[[True, False, True], [True, False, True]].tolist()
For this to work, JaggedArray.__getitem__ is 250 lines long! When I thought the rules we were targeting were simpler, I was more optimistic about implementing them all in complete generality in Numpy only. Now I'm beginning to think that the future compiled version of this wouldn't be just for a speedup, but also to get better parity with the original; maybe organize the code to mirror Numpy's.
This is still in the "feature-particle-matching-example" branch.
from awkward-0.x.