datacube's People

Contributors

cholmes, lossyrob, m-mohr, matthewhanson, philvarner, richardscottoz, tomaugspurger

datacube's Issues

Specify variable shape / chunking?

Storage formats like Zarr and NetCDF allow variables to be chunked along one or more dimensions (zarr docs).

Would this extension be an appropriate place to capture that information? Building on #6:

{
  "cube:variables": {
    "prcp": {
      "type": "data",
      "description": "...",
      "extent": [0, null],
      "unit": "mm",
      "dimensions": ["time", "y", "x"],
      // new chunking information below
      "shape": [365, 1000, 1000],
      "chunks": [1, 100, 100]
    }
  }
}

The expectation is that if chunks is specified, it is an array of integers whose length equals the number of dimensions. Each integer gives the chunk size along the corresponding dimension. Zarr, at least, doesn't allow non-uniform chunks, so a simple list of integers is sufficient. If non-uniform chunking is desired, you would need a list of lists of integers.
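To make the proposal concrete, here is a minimal Python sketch of how a client might check the proposed fields and derive the chunk grid. Note that `shape` and `chunks` are only the fields suggested in this issue, and `chunk_grid` is a hypothetical helper name, not part of any library.

```python
import math


def chunk_grid(shape, chunks):
    """Return the number of chunks along each dimension (uniform chunking)."""
    if len(shape) != len(chunks):
        raise ValueError("chunks must have exactly one entry per dimension")
    if any(c <= 0 for c in chunks):
        raise ValueError("chunk sizes must be positive integers")
    # The last chunk along a dimension may be partial, hence the ceiling.
    return [math.ceil(s / c) for s, c in zip(shape, chunks)]


# The "prcp" variable from the example above, reduced to the relevant fields.
variable = {
    "dimensions": ["time", "y", "x"],
    "shape": [365, 1000, 1000],
    "chunks": [1, 100, 100],
}

grid = chunk_grid(variable["shape"], variable["chunks"])
total_chunks = math.prod(grid)
print(dict(zip(variable["dimensions"], grid)), total_chunks)
```

With the values above, each time step is stored as a 10 x 10 grid of spatial chunks.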

This overlaps somewhat with proj:shape, but I'm not really sure how extensions interact with each other.

Order of the values

You can specify "values" (i.e. dimension labels) in each dimension. The spec describes it in various ways, but always includes

"a set of all potential values"

Recently, we realized that this wording may suggest to readers that the values are unordered: Open-EO/openeo-python-client#277 (comment)

This is unintentional, at least from my POV as the original author of this extension. This should probably be something like:

an ordered list of all values
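A short Python sketch of why the ordering matters for clients: if the values are treated as an ordered list, each label maps to a stable position along the dimension, which a set cannot guarantee. The labels and the helper name `label_to_index` are hypothetical and chosen only for illustration.

```python
# Hypothetical dimension labels, in the order given in the metadata.
values = ["B02", "B03", "B04"]

# With an ordered list, each label corresponds to a fixed index on the axis.
index_of = {label: i for i, label in enumerate(values)}


def label_to_index(label):
    """Translate a dimension label into its position along the dimension."""
    return index_of[label]


print(label_to_index("B03"))

# A set keeps the same members but carries no positional information,
# so the label-to-index mapping above could not be derived from it.
unordered = set(values)
```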

Variables

Origin: radiantearth/stac-spec#713

Two things came up recently that could be integrated into the data cube extension:

  1. Add variables in addition to dimensions. Some data cubes expose variables, some don't. We don't need this for openEO (yet?), but Google Earth Engine @simonff would probably use them. I'm also looking at netCDF and other formats, which, as far as I know, support variables in addition to dimensions. Maybe there's also room for alignment with the ESM collection spec "fork" from @rabernat.
  2. For dimensions it might be useful to specify the number of cells (see openEO UDF discussions).
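For the second point, a rough Python sketch of how the number of cells could be derived when it is not stated explicitly. This assumes a dimension object with either a `values` list or an `extent` plus a regular `step`, that both ends of the extent are inclusive label positions, and that `dimension_size` is a hypothetical helper. An explicit field, as suggested above, would remove the need for these assumptions.

```python
def dimension_size(dimension):
    """Estimate the number of cells along a dimension (assumptions above)."""
    if dimension.get("values") is not None:
        # Explicit labels: one cell per label.
        return len(dimension["values"])
    step = dimension.get("step")
    if step is None or step <= 0:
        raise ValueError("cannot derive a size without values or a positive step")
    low, high = dimension["extent"]
    # Regularly spaced labels with both ends of the extent included.
    return round((high - low) / step) + 1


x_dim = {"type": "spatial", "axis": "x", "extent": [0, 990], "step": 10}
band_dim = {"type": "bands", "values": ["red", "green", "blue"]}

print(dimension_size(x_dim), dimension_size(band_dim))
```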

Information about values

Add a way to provide information about the data cube values. Right now this extension is focused on dimensions, variables and labels.
This should probably be aligned with raster:bands. (We can't use raster:bands directly, as not all data cubes have bands.)

Potential fields:

  • Data type
  • Statistics
  • ...

Usage and maturity

At ITS_LIVE we have a list of Zarr data cubes (~3k items) and we'd like to generate a static STAC catalog, preferably using this extension. I'm a bit confused about the semantics of a data cube at the collection level. My idea is to have one top-level collection that describes the dataset, with each of the ~3000 Zarr cubes as an item of that collection that uses the datacube extension to describe its dimensions, variables, etc. I've only seen one cube per collection so far, so I'm not sure whether I'm thinking about this the wrong way. Any advice is welcome.
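As a rough illustration of the layout described above (one collection, one item per Zarr cube), the following Python sketch builds item skeletons as plain dictionaries. Everything specific here is an assumption made for the example: the collection id, the hrefs, the variable name `v`, the dimension extents, and the extension schema version in `DATACUBE_SCHEMA`. The result is a skeleton, not a complete, validated STAC item.

```python
# Assumed schema URL and version; check the extension's releases before use.
DATACUBE_SCHEMA = "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"


def make_cube_item(item_id, collection_id, href, start, end):
    """Build a skeleton STAC item describing a single Zarr cube."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "stac_extensions": [DATACUBE_SCHEMA],
        "id": item_id,
        "collection": collection_id,
        "geometry": None,  # placeholder; a real item would carry a footprint
        "properties": {
            "datetime": None,
            "start_datetime": start,
            "end_datetime": end,
            "cube:dimensions": {
                "time": {"type": "temporal", "extent": [start, end]},
                "y": {"type": "spatial", "axis": "y", "extent": [0, 1]},
                "x": {"type": "spatial", "axis": "x", "extent": [0, 1]},
            },
            "cube:variables": {
                # Hypothetical variable name, for illustration only.
                "v": {"type": "data", "dimensions": ["time", "y", "x"]},
            },
        },
        "links": [],
        "assets": {"zarr": {"href": href, "roles": ["data"]}},
    }


# Three hypothetical cubes that all belong to the same collection.
items = [
    make_cube_item(
        f"cube-{i:04d}",
        "example-cubes",
        f"https://example.com/cubes/cube-{i:04d}.zarr",
        "2000-01-01T00:00:00Z",
        "2020-12-31T23:59:59Z",
    )
    for i in range(3)
]
print([item["id"] for item in items])
```

In this sketch the collection itself would only carry dataset-level metadata, while the per-cube dimensions and variables live on the items.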
