Comments (8)
Mr. @jameshawkes and Mrs. @mathleur , this is very promissing and I will try and run some benchmarks on a 4Tb NWP Model Outputs TileDB Array. Is there any special querries you would suggest to stress test Polytope?
from polytope.
Hi @itsgifnotjiff ,
@mathleur has made a minimal example showing use of tileDB though the xarray engine:
https://gist.github.com/mathleur/7314e497d4fdd0224a3a3a8d92ba58d2
Would be good to understand if there are disadvantages to this xarray engine.
from polytope.
Hi @itsgifnotjiff and @jameshawkes, happy to help answer any questions regarding TileDB. You may see some decreases in performance compared to accessing TileDB directly since there are a few more steps going through the xarray backend but it should be completely doable workflow wise. The example from @mathleur looks good.
Yordan if you're using some of the datasets you've already ingested into TileDB it would just be using everything after opening the dataset in xarray (line 44 onward in the example). If you do notice that performance is outside what you'd find acceptable we're happy to help investigate to figure out where the bottleneck might be.
from polytope.
Hi @itsgifnotjiff, the main point for Polytope shapes is to define shapes on all dimensions of the datacube. In your second datacube, this would be forecast, lat and lon. So to define:
-
a flight path:
Request( Path([“lat”, “lon”], Box[“lat”, “lon”], [0,0], [1,1]), flight_points), Select(“forecast”, [0]))
(Note that this will be static in time and altitude because of the provided datacube, but when there are time and altitude dimensions available, a flight path should be a 4D path, which is a sweep of a 4D box along 4D flight points) -
Canada:
Request(Polygon([“lat”, “lon”], canada_points), Select(“forecast”, [0]))
-
a bounding box:
Request(Box([“lat”, “lon”], [0,0], [1,1]), Select(“forecast”, [0]))
-
a point:
Request(Point([“lat”, “lon”], [[0,0]], method=“surrounding”), Select(“forecast”, [0))
(Note that this will take all the surrounding points to the requested point available on the datacube) -
a vertical profile: This is difficult to achieve with the provided datacube, but it would probably be a
Point
in the lat/lon dimensions “augmented” by aSelect
in the “forecast” dimension (like in all the other examples above) and aSpan
in an altitude dimension.
Note also that the datacube needs to be defined as an Xarray DataArray instead of a Dataset currently.
Finally, to solve the shown ValueError, we've updated the develop branch, so it should now work! For any similar ValueErrors, each type of datacube dimension value needs to have an associated DatacubeAxis defined in the _type_to_axis_lookup
dictionary (at the bottom of the /polytope/datacube/datacube_axis.py file). In this case, for example, the type np.float32 had be added as a FloatDatacubeAxis().
I hope this helps but I would be happy to answer more questions and will update the documentation to make these steps clearer!
from polytope.
Hi @itsgifnotjiff, thanks for your feature request! I'm not familiar with TileDB, @mathleur will check out the docs and see if its doable.
We already have an xarray backend for polytope, is it possible to use xarray as a layer to read from a TileDB? I've seen this https://github.com/TileDB-Inc/TileDB-CF-Py but no experience with it.
from polytope.
Thank you for your work and consideration Mr. @jameshawkes.
Yes they have integrated with xarray by specifying xr.open_dataset(uri, engine="tiledb")
. But I am really curious to see if the query can not be pushed down to their C++ backend itself.
I think Mrs. Julia Dark @jp-dark , Mr. Isaiah Norton @ihnorton or Mr. Nick Kules can confirm if it is doable.
from polytope.
I would start simple with some point-profiles (e.g. time-series, vertical profiles) and some area extractions (e.g. boxes, polygons). We know that the xarray backend is not highly optimised, so I expect we will see some scaling issues at first, but we are actively working on this.
from polytope.
@jameshawkes and @mathleur sorry to bother you but i have a TileDB Array with a couple of TB of data in RottatedPole that looks like the image below and I am trying to construct some querries but understandably Polytope documentation does not give a lot of examples of how to build different queries.
If it is not too much to ask can you help me build a
- flight path
- Canada
- bounding box
- point,
- vertical profile
and any other query you think would be pertinent to test please?
or even better let's use this cube
slicer = HullSlicer()
polytope_API = Polytope(datacube=ds.TT, engine=slicer)
canada_bbox = [-141, -50, 39, 83],
box1 = Box(["lon", "lat"], [-141, -50], [39, 83])
request = Request(box1)
result = polytope_API.retrieve(request)
print([leaf.result for leaf in result.leaves])
but I get
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[8], line 2
1 slicer = HullSlicer()
----> 2 polytope_API = Polytope(datacube=ds.TT, engine=slicer)
4 canada_bbox = [-141, -50, 39, 83],
6 box1 = Box(["lon", "lat"], [-141, -50], [39, 83])
File ~/.conda/envs/polytope_exp/lib/python3.11/site-packages/polytope/polytope.py:49, in Polytope.__init__(self, datacube, engine, axis_options, datacube_options)
46 if datacube_options is None:
47 datacube_options = {}
---> 49 self.datacube = Datacube.create(datacube, axis_options)
50 self.engine = engine if engine is not None else Engine.default()
File ~/.conda/envs/polytope_exp/lib/python3.11/site-packages/polytope/datacube/backends/datacube.py:156, in Datacube.create(datacube, axis_options, datacube_options)
153 if isinstance(datacube, (xr.core.dataarray.DataArray, xr.core.dataset.Dataset)):
154 from .xarray import XArrayDatacube
--> 156 xadatacube = XArrayDatacube(datacube, axis_options, datacube_options)
157 return xadatacube
158 else:
File ~/.conda/envs/polytope_exp/lib/python3.11/site-packages/polytope/datacube/backends/xarray.py:37, in XArrayDatacube.__init__(self, dataarray, axis_options, datacube_options)
35 if self.dataarray[name].dims == ():
36 options = axis_options.get(name, None)
---> 37 self._check_and_add_axes(options, name, values)
38 treated_axes.append(name)
39 for name in dataarray.dims:
File ~/.conda/envs/polytope_exp/lib/python3.11/site-packages/polytope/datacube/backends/datacube.py:76, in Datacube._check_and_add_axes(self, options, name, values)
74 DatacubeAxis.create_standard(name, values, self)
75 elif name not in self._axes.keys():
---> 76 DatacubeAxis.create_standard(name, values, self)
File ~/.conda/envs/polytope_exp/lib/python3.11/site-packages/polytope/datacube/datacube_axis.py:108, in DatacubeAxis.create_standard(name, values, datacube)
105 @staticmethod
106 def create_standard(name, values, datacube):
107 values = np.array(values)
--> 108 DatacubeAxis.check_axis_type(name, values)
109 if datacube._axes is None:
110 datacube._axes = {name: deepcopy(_type_to_axis_lookup[values.dtype.type])}
File ~/.conda/envs/polytope_exp/lib/python3.11/site-packages/polytope/datacube/datacube_axis.py:120, in DatacubeAxis.check_axis_type(name, values)
116 @staticmethod
117 def check_axis_type(name, values):
118 # NOTE: The values here need to be a numpy array which has a dtype attribute
119 if values.dtype.type not in _type_to_axis_lookup:
--> 120 raise ValueError(f"Could not create a mapper for index type {values.dtype.type} for axis {name}")
ValueError: Could not create a mapper for index type <class 'numpy.float32'> for axis pres
from polytope.
Related Issues (20)
- No module named 'pygribjump' HOT 18
- Use earth kit in examples to load data
- In xarray get_indices, at beginning, should assert that the label == next axis
- in span, init, sanity check points coming in, shouldn't be arrays of points
- make CI badge work
- Can this be used to optimize the extraction of grid point timeseries from grib files? HOT 5
- fix datacube create method to switch for octahedral grid and normal grid
- Handle cyclic remapping inside of Datacube, not request tree
- Merge transformation for xarray
- Common transformation class for cyclic, mapper and other transformations
- Reverse transformation HOT 1
- New transformation options
- Support cyclic+reverse together
- Create cyclic transformation axis
- Return existing indices in the datacube for cyclic transformation
- Add surrounding points option to the Select shape
- Create healpix grid mapper
- Add gribjump to CI workflow
- Modify CI to work with eccodes and gdal
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from polytope.