This is a bit of a catch-all issue for work relating to the requirements associated with the NetCDF output from GeoFabrics (and also input to BGFLOOD).
Requirements:
- Store each geofabric dataset in one NetCDF file with each layer as different variables.
- Could also do multiple resolutions with each elevation/roughness set in a different group
- Could also include forcing information in the same netCDF file if we want all inputs in a single file
- Include spatial_ref information like geotransform and crs - see section below.
- Done using rioxarray and write_crs() and write_transform()
- Include appropriate variable and coordinate attributes:
- units: See link for conventions.
- long_name: A description of the variable/coordinate.
- standard_name: Must be included in the NUG table.
- vertical_datum: Non-standard attribute name to record the vertical datum of any elevation data.
- Record the parameters of a run:
- Data sources should be documented in the attributes (i.e. land layer x, revision y)
- Capture the parameter information in a variable, or perhaps a json dump into a group attribute.
- Geofabrics at different resolutions should be aligned, with resolutions that divide evenly into one another (to ensure that alignment)
- DEMs and roughness were originally required to be defined over a full rectangular grid with no NaN values, but NaNs have since been deemed ok.
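The alignment requirement above could be checked with something like the following. This is only a sketch: the function names and the idea of comparing grid origins are assumptions for illustration, not existing GeoFabrics code. It uses exact fractions rather than floating-point modulo so that resolutions such as 0.1 and 0.3 compare cleanly.

```python
from fractions import Fraction


def _exact(value):
    """Convert a number to an exact Fraction via its decimal string form."""
    return Fraction(str(value))


def is_evenly_divisible(coarse_res, fine_res):
    """Return True if the coarse resolution is a whole multiple of the fine one."""
    ratio = _exact(coarse_res) / _exact(fine_res)
    return ratio.denominator == 1


def is_aligned(origin_a, origin_b, fine_res):
    """Return True if two grid origins differ by a whole number of fine cells."""
    offset = (_exact(origin_a) - _exact(origin_b)) / _exact(fine_res)
    return offset.denominator == 1


# Hypothetical 10 m and 2.5 m geofabrics sharing a common grid.
resolutions_ok = is_evenly_divisible(10, 2.5)
origins_ok = is_aligned(1000.0, 1010.0, 2.5)
```

The same origin check would be applied separately to the x and y origins of each pair of geofabrics.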
NetCDF conventions
There are standards defining the conventions for attributes in netCDF files.
The spatial_ref coordinate
This is where information associated with the coordinate system and projection (CRS and transform) is encoded.
Coordinate systems CF-1.6 <--> CRS
There is an optional grid mapping attribute called crs_wkt that may be used to specify multiple coordinate system properties in so-called well-known text format (usually abbreviated to CRS WKT or OGC WKT), as detailed on the cfconventions.org page, with example mappings on the GitHub page.
It looks like this information is sometimes encoded within a "spatial_ref" coordinate (see issue).
Python Libraries
There are various Python libraries for interacting with NetCDF files, including netCDF4, xarray, and rioxarray. netCDF4 is an engine used by xarray to read and write netCDF files. xarray has some powerful constructs for building and interacting with data stored in netCDF files. rioxarray combines xarray with rasterio by providing access to the rasterio engine through the rio accessor.
xarray supports two main objects - DataArray and Dataset. A DataArray works well for a single layer of data (possibly across many bands), and the Dataset class should be used for multiple variables that may have different dimensions (e.g. different resolutions, or x,y vs time).
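A minimal sketch of a Dataset holding two layers as separate variables, with the attributes listed in the requirements, might look like the following. The variable names (z, zo), the attribute values, and the contents of the parameters dictionary are all made up for illustration; in particular the standard_name and vertical_datum values would need checking against the relevant tables.

```python
import json

import numpy as np
import xarray as xr

x = np.arange(0.0, 50.0, 10.0)   # 5 columns
y = np.arange(40.0, 0.0, -10.0)  # 4 rows: 40, 30, 20, 10
shape = (y.size, x.size)

ds = xr.Dataset(
    data_vars={
        # Each layer is a separate variable: (dims, data, attrs).
        "z": (
            ("y", "x"),
            np.zeros(shape),
            {
                "units": "m",
                "long_name": "ground elevation",
                "standard_name": "surface_altitude",
                "vertical_datum": "NZVD2016",  # non-standard attribute
            },
        ),
        "zo": (
            ("y", "x"),
            np.full(shape, 0.01),
            {
                "units": "m",
                "long_name": "roughness length",
                "standard_name": "surface_roughness_length",
            },
        ),
    },
    coords={
        "x": ("x", x, {"units": "m", "long_name": "easting"}),
        "y": ("y", y, {"units": "m", "long_name": "northing"}),
    },
)

# Hypothetical run parameters, stored as a JSON dump in a global attribute.
parameters = {"resolution": 10.0, "land_layer": "x", "revision": "y"}
ds.attrs["parameters"] = json.dumps(parameters)
```

Writing is then a call to ds.to_netcdf(); that method also takes a group argument, which is one possible route to the multiple-resolution layout mentioned in the requirements.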
Other
There is potential for a translation layer between GeoFabrics and BG-FLOOD, or between the catchment generation code and GeoFabrics and/or BG-FLOOD. This would live in a repository separate from both.