lewisacidic / jagged-array
Multidimensional [jr]agged array support for the PyData ecosystem
License: MIT License
Repr will be broken when we address #14 as the shapes will get quite long vertically.
We should probably realign the repr to look like pydata/sparse for consistency.
pytest should be runnable with python setup.py test
The array of arrays jagged format is more properly referred to as an Iliffe vector.
We could implement this as a new jagged class?
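For reference, a minimal sketch of what an Iliffe vector looks like in numpy terms, next to the equivalent flat layout. The variable names are illustrative only and are not part of the jagged API.

```python
import numpy as np

# Independent rows of different lengths.
rows = [np.array([0, 1, 2]), np.array([3, 4]), np.array([5, 6, 7])]

# Iliffe vector: a 1D object array holding a reference to each row.
# Filled element by element so numpy never tries to stack the rows.
iliffe = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    iliffe[i] = row

# Equivalent flat layout: one contiguous buffer plus the row lengths.
flat = np.concatenate(list(iliffe))
lengths = np.array([len(row) for row in iliffe])
```

The object array costs one pointer indirection per row, while the flat layout keeps all elements in a single buffer.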
We should have pre-commit hooks set up
Nice example: pytest
Emulate numpy in not printing the dtype if it's the same as the default dtype for the data.
e.g.
>>> np.array([1, 2])
array([1, 2])
>>> np.array([1., 2.])
array([1., 2.])
>>> np.array([1., 2.], np.float32)
array([1., 2.], dtype=float32)
i.e. we should do
>>> JaggedArray([1, 2, 3], [[1, 2]])
JaggedArray(data=[1 2 3],
shape=[[1 2]])
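One possible way to decide whether the repr needs a dtype is sketched below. The helper name dtype_suffix and the set of default dtypes are assumptions; numpy's own rule is more involved, so treat this as an approximation of the behaviour shown above.

```python
import numpy as np

# Dtypes numpy infers for plain Python ints, floats, complex numbers and bools.
# Assumption: these are the only dtypes the repr should leave implicit.
DEFAULT_DTYPES = {np.dtype(int), np.dtype(float), np.dtype(complex), np.dtype(bool)}


def dtype_suffix(arr):
    """Return the ', dtype=...' part of a repr, or '' for a default dtype."""
    if arr.dtype in DEFAULT_DTYPES:
        return ""
    return f", dtype={arr.dtype}"


implicit = dtype_suffix(np.array([1.0, 2.0]))
explicit = dtype_suffix(np.array([1.0, 2.0], np.float32))
```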
Implement __getstate__ and __setstate__ for easy pickling.
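A rough sketch of the pickling hooks, using a hypothetical stand-in class rather than the real JaggedArray. It assumes the whole state can be rebuilt from the flat data and the shape array.

```python
import pickle

import numpy as np


class PicklableJagged:
    """Hypothetical stand-in for JaggedArray: flat data plus a shape array."""

    def __init__(self, data, shape):
        self.data = np.asarray(data)
        self.shape = np.asarray(shape)

    def __getstate__(self):
        # Only the flat data and the shape are needed to rebuild the array.
        return {"data": self.data, "shape": self.shape}

    def __setstate__(self, state):
        # Restore the attributes without going through __init__.
        self.data = state["data"]
        self.shape = state["shape"]


ja = PicklableJagged([1, 2, 3], [[1, 2]])
restored = pickle.loads(pickle.dumps(ja))
```

Anything derived from data and shape (cached offsets, for example) would be recomputed in __setstate__ rather than stored.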
We should have an operation to make a jagged dimension smooth. If all jagged axes are smoothed, we get a numpy array back.
>>> ja = JaggedArray([[0, 1, 2], [3, 4], [5, 6, 7]]); ja
JaggedArray([[0, 1, 2],
[3, 4],
[5, 6, 7]])
>>> ja.smoothe()
array([[0, 1],
[3, 4],
[5, 6]])
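The example above truncates every row to the length of the shortest one. A minimal sketch of that behaviour on plain nested lists, where smooth_rows is a hypothetical helper and not an existing jagged function:

```python
import numpy as np


def smooth_rows(rows):
    """Truncate each row to the shortest row length and stack the result."""
    min_len = min(len(row) for row in rows)
    return np.array([row[:min_len] for row in rows])


dense = smooth_rows([[0, 1, 2], [3, 4], [5, 6, 7]])
```

Padding to the longest row would be the other obvious policy; the example in this issue only shows truncation.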
We should have some factory methods, such as
array
asarray
asfortranarray
ascontiguousarray
numpy exposes a memoryview as the data attribute of an array. As we are trying to emulate the API as much as possible, we should probably use that too.
We can use np.ndarray(buffer=self.data, ...) to access the data on indexing.
The 1D data would still be exposable using ravel.
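A sketch of how indexing could build views on a shared buffer with np.ndarray(buffer=...). The row-length layout and the get_row helper are assumptions made for illustration.

```python
import numpy as np

# Flat storage exposed as a memoryview, as numpy does for ndarray.data.
data = np.array([1, 2, 3], dtype="i4").data
lengths = [2, 1]  # row lengths (assumed layout)
dtype = np.dtype("i4")

# Offset, in elements, of the start of each row within the flat buffer.
starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))


def get_row(i):
    """Return row i as a view on the shared buffer, without copying."""
    return np.ndarray(
        shape=(lengths[i],),
        dtype=dtype,
        buffer=data,
        offset=int(starts[i]) * dtype.itemsize,
    )


first = get_row(0)
second = get_row(1)
```

Note that offset is given in bytes, hence the multiplication by the item size.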
>>> JaggedArray([1, 2, 3], (2, (2, 1)), dtype='i4')
JaggedArray([[1, 2],
[3]], dtype=int32)
jagged_to_masked needs data to be made contiguous.
We should have empty and empty_like jagged equivalents.
Look into using versioneer
>>> JaggedArray([[[0, 1],
[2, 3]],
[[4, 5]],
[[6, 7],
[8, 9]]]).reshape(-1, 2)
JaggedArray([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
We should use jagged.array for these types of operations.
We should have an allclose array method in the API.
Could implement this by playing with strides.
We can currently slice only in the first dimension,
i.e.
>>> ja = jagged.JaggedArray(np.arange(22), (3, (3, 2, 3), (3, 2, 3)))
>>> ja[0] # simple indexing
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> ja[:2] # slicing
JaggedArray([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10],
[11, 12]]])
>>> ja[::2] # with a step
JaggedArray([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[13, 14, 15],
[16, 17, 18],
[19, 20, 21]]])
We can't slice in other directions. While this may be less useful for jagged arrays, it should be supportable, i.e.
>>> ja[0, 0, 0]
0
>>> ja[0, 0]
array([0, 1, 2])
>>> ja[0, :, 0]
array([0, 3, 6])
>>> ja[:, 0]
JaggedArray([[ 0, 1, 2],
[ 9, 10],
[13, 14, 15]])
>>> ja[:, 0, 0]
array([0, 9, 13])
We should also support advanced indexing:
>>> ja[[0, 2]]
JaggedArray([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[13, 14, 15],
[16, 17, 18],
[19, 20, 21]]])
>>> ja[:, [0, 1]]
JaggedArray([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 9, 10],
[11, 12]],
[[13, 14, 15],
[16, 17, 18]]])
>>> ja[:, [0, 1], [0, 1]]
JaggedArray([[ 0, 4],
[ 9, 12],
[13, 17]])
>>> ja[:, [0, 2], [0, 2]]
JaggedArray([[ 0, 8],
[ 9],
[13, 21]])
Finally, allow np.newaxis (or None), ... (Ellipsis), etc.
>>> ja[..., 0, 0]
array([0, 9, 13])
>>> ja[:, np.newaxis]
JaggedArray([[[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]]],
[[[ 9, 10],
[11, 12]]],
[[[13, 14, 15],
[16, 17, 18],
[19, 20, 21]]]])
We should implement strides for better slicing support.
These can be a collection over subarrays much as shape / shapes are.
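As a sketch of what per-subarray strides might look like, the helper below computes C-contiguous byte strides for each subarray from its shape. The function name and the one-row-per-subarray shape layout are assumptions, not the library's current design.

```python
import numpy as np


def subarray_strides(shapes, itemsize):
    """C-order byte strides for each subarray, one row per subarray."""
    shapes = np.asarray(shapes)
    # Running product of the extents, taken from the last axis backwards.
    rev_cumprod = np.cumprod(shapes[:, ::-1], axis=1)[:, ::-1]
    strides = np.ones_like(shapes)
    # The stride of an axis is the product of all later extents.
    strides[:, :-1] = rev_cumprod[:, 1:]
    return strides * itemsize


strides = subarray_strides([[3, 3], [2, 2], [3, 3]], itemsize=8)
```

Each row matches what numpy reports for a contiguous array of that shape, for example np.zeros((3, 3)).strides.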
Add IO for
Currently, the shape array is oriented so that axis 0 relates to the rest of the dimensions, and axis 1 relates to the first dimension. Why is known only to my thesis-addled brain, as it is not intuitive, especially when slicing it:
>>> ja = JaggedArray(np.arange(22), [[3, 2, 3], [3, 2, 3]])
>>> ja[0]
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> ja.shape[0] # intuitively, the shape of `ja[0]`, but
array([3, 2, 3])
>>> ja.shape[:, 0]
array([3, 3])
This of course will require quite the refactor, probably better a full rewrite.
jagged.random((5, 5)) will always have maximum limits (5, 4).
Write a good readme, with
I don't think typing was even a thing when the initial code from scikit-chem was written. It should be relatively easy to add types in now.
Add an nbytes property
One dimensional jagged arrays (while being a bit pointless...) are possible, but currently don't print nicely:
>>> JaggedArray([1, 2, 3], (3,))
JaggedArray([1, 2, 3])
Jagged-like objects used as arguments to jagged functions such as concatenate should be converted to jagged arrays, just as numpy converts array-likes.
Code was written using pylint at some point, but is likely outdated. Would be nice to have autoformatting and linting to ensure code quality.
Would be good to allow the use of dense arrays in concatenate etc.
This would be trivial to implement simply by converting the dense array to a jagged array, although there could be optimisations made to avoid that overhead.
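The simple conversion could look roughly like this. dense_to_flat is a hypothetical helper, and the layout (flat data plus one row of trailing extents per subarray) is an assumption rather than the layout the library necessarily uses.

```python
import numpy as np


def dense_to_flat(dense):
    """Split a dense array into flat data and one shape row per subarray."""
    dense = np.asarray(dense)
    data = dense.ravel()
    # Every subarray along axis 0 shares the same trailing shape.
    shapes = np.tile(dense.shape[1:], (dense.shape[0], 1))
    return data, shapes


data, shapes = dense_to_flat(np.arange(6).reshape(2, 3))
```

For a contiguous input, ravel returns a view, so the only overhead here is building the repeated shape rows.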
We should allow JaggedArray([[1, 2],[3]]) to work
We should have a continuous integration server running.
Probably CircleCI
Currently, you pass a shape to reshape as a single argument, i.e.
>>> JaggedArray([[0, 1], [2], [3, 4]]).reshape((3, (1, 2, 2)))
JaggedArray([[0],
[1, 2],
[3, 4]])
Instead, we should emulate numpy, where the dimensions of the shape are passed as separate arguments:
>>> np.arange(25).reshape(5, 5)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
i.e.
>>> JaggedArray([[0, 1], [2], [3, 4]]).reshape(3, (1, 2, 2))
JaggedArray([[0],
[1, 2],
[3, 4]])
We could also allow passing shapes, like
>>> JaggedArray([[0, 1], [2], [3, 4]]).reshape(shapes=[[1], [2], [2]])
JaggedArray([[0],
[1, 2],
[3, 4]])
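Supporting both calling styles could be done by normalising the arguments first, in the same way numpy accepts either reshape(5, 5) or reshape((5, 5)). The helper name normalize_shape is hypothetical.

```python
def normalize_shape(*shape):
    """Accept reshape(3, (1, 2, 2)) as well as reshape((3, (1, 2, 2)))."""
    # A single tuple or list argument is taken to be the whole shape.
    if len(shape) == 1 and isinstance(shape[0], (tuple, list)):
        return tuple(shape[0])
    # Otherwise each positional argument describes one axis.
    return shape


as_args = normalize_shape(3, (1, 2, 2))
as_tuple = normalize_shape((3, (1, 2, 2)))
```

This relies on the first axis always being a plain integer, so a lone tuple is never mistaken for a jagged axis description.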
We should measure coverage, using pytest-cov.