envmodellinggroup / hrds
Hierarchical raster data set: smooth interpolation of raster files at different resolutions for multiscale modelling
License: GNU General Public License v3.0
By choosing a dx that resolves the buffer, we run into the problem that the buffer and the original data no longer match (see image). This leads to two problems: where the buffer is larger than the data, we don't smoothly transition the data, and where the buffer is smaller, we don't have any buffer!
The buffer should be generated on exactly the same grid as the initial raster, and interpolation will have to do the rest. This means that if we switch from, say, linear to sigmoidal buffers, the user will have to ensure the distance and/or resolution are adequate.
Due to interpolation, point_in() can return true, but the interpolation of data in get_val() can require data from the neighbouring cell, which might fall outside the extents.
To solve: add extra code to check against the extent minus one cell's resolution, and add a test to check this works correctly.
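A standalone sketch of the suggested fix (the function and argument names follow the issue but are illustrative, not the HRDS implementation): shrink the acceptance extent by one cell so that the interpolation stencil in get_val() always stays inside the raster.

```python
def point_in(point, extent, dx):
    """Return True if `point` lies inside `extent` shrunk by one cell
    width (dx) on every side, so a later bilinear interpolation can
    always reach the neighbouring cell.

    extent = (xmin, ymin, xmax, ymax)
    """
    xmin, ymin, xmax, ymax = extent
    x, y = point
    return (xmin + dx <= x <= xmax - dx and
            ymin + dx <= y <= ymax - dx)

extent = (0.0, 0.0, 10.0, 10.0)
print(point_in((9.95, 5.0), extent, dx=0.1))  # False: too close to the edge
print(point_in((9.85, 5.0), extent, dx=0.1))  # True
```

A test for the fix would then query points within one resolution of the edge and check get_val() either succeeds or is cleanly rejected.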
Submitted as part of a review for JOSS.
The instructions include a reference to a specific version of pygdal:
Install using pip, the correct version. Note you may need to increase the minor version number, e.g. from 2.1.3 to 2.1.3.3
pip install pygdal==2.1.3
I needed to modify this to pip install pygdal==2.2.3.3. It might be better to state that the version number needs to match the output of gdal-config --version (with the possible addition of a minor version number), rather than pinning it to a specific version that will go stale quickly.
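A sketch of how that advice could be automated. The only assumption is pygdal's convention of appending an extra build component to the GDAL version (e.g. GDAL 2.2.3 becomes pygdal 2.2.3.3), which a PEP 440 wildcard pin tracks without hard-coding:

```python
import subprocess

def pygdal_requirement(gdal_version=None):
    """Build a pip requirement matching the installed GDAL version.

    pygdal releases append a fourth (build) component to the GDAL
    version, so a wildcard pin like pygdal==2.2.3.* picks up the
    right release without pinning a version that will go stale.
    """
    if gdal_version is None:
        # Ask the system GDAL for its version (needs gdal-config on PATH).
        gdal_version = subprocess.check_output(
            ["gdal-config", "--version"], text=True
        ).strip()
    return "pygdal==%s.*" % gdal_version

print(pygdal_requirement("2.2.3"))  # pygdal==2.2.3.*
```

The resulting string can be handed straight to pip, e.g. pip install "$(python get_req.py)".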
Submitted as part of a review for JOSS.
A few minor suggestions for the paper.
There are limitations in this approach in that rasters cannot be partially overlapped; all but the base raster must be entirely contained within another raster.
This software solves a particular problem when using multiscale numerical models: high resolution meshes require high resolution data, but only over spatially limited regions.
By blending a hierarchy of data sources, HRDS overcomes this problem and enables multiscale numerical models to use spatially appropriate data with minimal effort.
Submitted as part of a review for JOSS.
174ca76 introduces some very helpful description, but also has a few issues. The hyperlinks are missing for GEBCO and EMod and the OceanWise hyperlink is missing the exclamation mark that will make it render properly.
Submitted as part of a review for JOSS.
The citation for GDAL doesn't follow the recommendation provided by GDAL.
Add user documentation beyond README and improve the docstrings throughout.
Submitted as part of a review for JOSS.
It's implicit in the sudo apt-get install instruction that the software is intended to be installed and run on a Linux distribution. It may be worth adding a statement about support, or lack thereof, for other systems.
There are two test classes called RealDataTest in test_hrds.py. I haven't checked whether it's a complete duplicate by mistake, or an error -- but it should be fixed one way or another.
Some datasets have holes or missing data. The buffer should be constructed from the nodata values as well as the edge to facilitate data of this kind. I think it's as simple as setting the nodata cells to 1 before calculating the Euclidean distance.
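A minimal sketch of that suggestion on a tiny grid, using a brute-force Euclidean distance transform (real code would use something like scipy.ndimage.distance_transform_edt; the nodata value and helper function here are illustrative):

```python
import numpy as np

def distance_to_sources(mask):
    """Brute-force Euclidean distance (in cells) from every cell to the
    nearest True cell in `mask`. Fine for tiny grids only."""
    src = np.argwhere(mask)            # (row, col) of each source cell
    ny, nx = mask.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    # Distance from each cell to every source, then take the minimum.
    d = np.sqrt((yy[..., None] - src[:, 0]) ** 2 +
                (xx[..., None] - src[:, 1]) ** 2)
    return d.min(axis=-1)

NODATA = -9999
data = np.full((5, 5), 10.0)
data[2, 2] = NODATA                    # a hole in the raster

# Treat the raster edge *and* the nodata cells as buffer sources.
sources = np.zeros_like(data, dtype=bool)
sources[0, :] = sources[-1, :] = True
sources[:, 0] = sources[:, -1] = True
sources[data == NODATA] = True         # the fix suggested above

dist = distance_to_sources(sources)
print(dist[2, 2])  # 0.0 -> the hole now gets its own buffer
```

With the nodata cells marked, the hole in the middle of the raster is buffered exactly like the outer edge.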
I am very new to hrds with Python, but I am trying to use hrds with real bathymetry data.
I followed the instructions for installing hrds using python3 on Ubuntu 20.04, but I got the following error.
Does anybody have suggestions or comments? I would appreciate any help.
Traceback (most recent call last):
  File "exxyz.py", line 11, in <module>
    bathy = hrds("gebco_uk.tif", rasters=("emod_utm.tif", "inspire_data.tif"), distances=(700, 200))
TypeError: 'module' object is not callable
Picked up in the buffer test.
If I ask for the point
point4 = [0.899999, 2]  # should be 0.5
in the buffer I get 0.45000066964685909, but if I ask for
point4 = [0.9, 2]  # should be 0.5
I get 0.55000001192092907. They shouldn't be 0.1 away from each other!
The error must be in the rasterInterpolator, so more tests need adding for that.
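As a sketch of the kind of continuity test that would catch this (a standalone bilinear interpolator, not HRDS's rasterInterpolator itself), values at points a hair apart should agree:

```python
import numpy as np

def bilinear(grid, origin, dx, point):
    """Bilinear interpolation on a regular grid.
    grid[j, i] holds the value at (origin[0] + i*dx, origin[1] + j*dx)."""
    x = (point[0] - origin[0]) / dx
    y = (point[1] - origin[1]) / dx
    i, j = int(np.floor(x)), int(np.floor(y))
    # Clamp so the 2x2 stencil stays inside the grid.
    i = min(max(i, 0), grid.shape[1] - 2)
    j = min(max(j, 0), grid.shape[0] - 2)
    fx, fy = x - i, y - j
    return ((1 - fx) * (1 - fy) * grid[j, i] +
            fx * (1 - fy) * grid[j, i + 1] +
            (1 - fx) * fy * grid[j + 1, i] +
            fx * fy * grid[j + 1, i + 1])

# A linear ramp: value equals the x coordinate, so interpolation is exact.
nx = ny = 21
dx = 0.2
grid = np.tile(np.arange(nx) * dx, (ny, 1))

a = bilinear(grid, (0.0, 0.0), dx, (0.899999, 2.0))
b = bilinear(grid, (0.0, 0.0), dx, (0.9, 2.0))
assert abs(a - b) < 1e-5, "values this close apart imply an indexing bug"
```

A jump of ~0.1 between these two queries, as reported above, points at an off-by-one in the cell index rather than at the interpolation weights.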
It's annoying to have to put in a second dummy raster. Add the capability to use a single one.
In the TestHRDS class, the teardown removes two "buffer" files. But these are not explicitly created in the tests. I haven't dug into the code, but I assume these are automatically created by the HRDS class. Packages really shouldn't create non-explicit temp files like this -- either the user should explicitly specify where they want the temp files to be created, or they should be created in a temp dir (import tempfile), and ideally automatically deleted by the code.
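A sketch of the tempfile approach (make_buffer here is a stand-in for whatever HRDS does internally when it writes its buffer rasters):

```python
import os
import tempfile

def make_buffer(path):
    # Stand-in for HRDS writing a buffer raster to `path`.
    with open(path, "w") as f:
        f.write("pretend raster buffer\n")

# Buffers go in a temporary directory that is deleted automatically,
# instead of appearing unannounced next to the user's input files.
with tempfile.TemporaryDirectory() as tmpdir:
    buf_path = os.path.join(tmpdir, "emod_buffer.tif")
    make_buffer(buf_path)
    assert os.path.exists(buf_path)   # available while HRDS runs

print(os.path.exists(buf_path))  # False: cleaned up on exit
```

Alternatively, an explicit buffer-directory argument would let users keep (and reuse) the buffer files deliberately.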
Currently each core in a parallel run loads the whole raster, which can cause memory issues with large rasters. Allow specification of bounds, which can then be used to load in a subset of the raster.
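A sketch of the bounds-to-pixel-window arithmetic this would need, assuming a north-up GDAL-style geotransform (the function name is made up):

```python
def pixel_window(geo_transform, bounds):
    """Convert geographic bounds to an (xoff, yoff, xsize, ysize) pixel
    window for a north-up GDAL geotransform (x0, dx, 0, y0, 0, -dy).

    Each core could then read only this window, e.g. with
    band.ReadAsArray(xoff, yoff, xsize, ysize), instead of the whole
    raster.
    """
    x0, dx, _, y0, _, neg_dy = geo_transform
    dy = -neg_dy
    xmin, ymin, xmax, ymax = bounds
    xoff = int((xmin - x0) / dx)          # columns from the left edge
    yoff = int((y0 - ymax) / dy)          # rows from the top edge
    xsize = int(round((xmax - xmin) / dx))
    ysize = int(round((ymax - ymin) / dy))
    return xoff, yoff, xsize, ysize

# 10 m resolution raster with origin (500000, 6000000)
gt = (500000.0, 10.0, 0.0, 6000000.0, 0.0, -10.0)
print(pixel_window(gt, (500100.0, 5999000.0, 500600.0, 5999800.0)))
# -> (10, 20, 50, 80)
```

In practice the window would also need clipping to the raster size and padding by a cell or two so interpolation near the bounds still works.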
Submitted as part of a review for JOSS.
While signing up to download GEBCO data so that I could actually run hdrs, I was presented with the following licence, which suggests that some GEBCO data probably could be included with hdrs. I realise that there are other reasons not to include data - repo size etc. - but having a couple of tiny example datasets would help users get started.
GEBCO's gridded bathymetric data sets are placed in the public domain and may be used free of charge.
Data within GEBCO's gridded bathymetric data sets are subject to copyright and database rights restrictions. Use
of GEBCO's gridded bathymetric data sets indicates that you accept the terms and conditions of use and
disclaimer information given below.
Under these terms you are free to:
Copy, publish, distribute and transmit the information
Adapt the information
Exploit the information commercially for example, by combining it with other information, or by including it in your own product or application
Terms and conditions of use
You must acknowledge the source of the data. A suitable form of attribution is given in the documentation that accompanies the data set.
Ensure that you do not use the information in a way that suggests any official status or that GEBCO endorses you or your use of the information.
Ensure that you do not mislead others or misrepresent the information or its source.
Disclaimer
GEBCO's gridded bathymetric data sets are made available 'as is'. While every effort has been made to ensure reliability within the limits of present knowledge, the accuracy and completeness of GEBCO's gridded bathymetric data sets cannot be guaranteed. No responsibility can be accepted by those involved in their compilation or publication for any consequential loss, injury or damage arising from their use or for determining the fitness of the data for any particular use.
Users should be aware that GEBCO's gridded bathymetric data sets are largely deep ocean products (based on trackline data from many different sources of varying quality and coverage) and do not include detailed bathymetry for shallow shelf waters. Although GEBCO's gridded bathymetric data sets are presented at 30 arc-second and one arc-minute intervals of latitude and longitude, this does not imply that knowledge is available on seafloor depth at this resolution. Users are advised to consult the accompanying data set documentation before using the data sets.
GEBCO'S global bathymetric data sets shall not be used for navigation or for any other purpose involving safety at sea.
Users are asked to report any problems encountered with the data as this feedback may result in the further improvement of GEBCO's data sets.
Submitted as part of a review for JOSS.
Is it hrds or hdrs? The repo is hdrs, but the directory name and JOSS paper both suggest hrds.
The tests are put in a package that sits next to the hrds package, and the setup.py uses setuptools.find_packages to identify the packages to install. So when installed, you get a top-level package called "tests" with the hrds tests in it. This is not good.
Here are some thoughts on where to put tests:
http://pythonchb.github.io/PythonTopics/where_to_put_tests.html
I've provided a PR here that puts the tests internal to the package:
I also added formatting, etc. changes to the setup.py in that PR.
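For comparison, if the tests were to stay outside the package, the usual alternative is to exclude them in setup.py (a sketch; the PR mentioned in this issue instead moves them inside the package):

```python
from setuptools import find_packages

# Sketch: in setup.py, exclude the top-level tests package so that
# installing hrds does not also install a global "tests" package:
#
#     setup(..., packages=find_packages(exclude=["tests", "tests.*"]))
#
packages = find_packages(exclude=["tests", "tests.*"])
print(packages)  # never contains "tests"
```

Either approach prevents the stray top-level "tests" package from landing in site-packages.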
Submitted as part of a review for JOSS.
In the buffer tests, something is wrong with one of the docstrings. It says a 10x10 grid with dx=0.2 and limits of (0,0) and (4,4). Either the size, the limits, or the dx must be wrong: a 10x10 grid at dx=0.2 spans only 1.8 units, while covering (0,0) to (4,4) at dx=0.2 would need a 21x21 grid.
If the raster data has a non-trivial geo_transform, the raster buffer does not honour that information when it is created. You then get spurious coordinate errors and/or NaN values in your final output. Example image from Q attached.
You can see the buffer is rotated and offset from the original data. This is not trivial to fix, as the pixels need aligning to the distance given.
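A sketch of what honouring the full geotransform means (pure arithmetic; the rotated geotransform values are invented for illustration):

```python
def apply_geo_transform(gt, col, row):
    """Map pixel (col, row) to georeferenced (x, y) using the full
    six-parameter GDAL geotransform, including the rotation/shear
    terms gt[2] and gt[4] that a naive buffer creation drops."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

# A rotated geotransform: writing the buffer with only gt[0], gt[1],
# gt[3], gt[5] produces exactly the offset and rotation visible in
# the attached image.
gt = (1000.0, 9.0, 4.0, 2000.0, 4.0, -9.0)
print(apply_geo_transform(gt, 10, 10))  # (1130.0, 1950.0)
```

Copying the source raster's complete geotransform onto the buffer raster would keep the two aligned; the remaining difficulty noted above is that the buffer distance is specified in map units, so the pixel spacing along the rotated axes has to be derived from gt[1], gt[2], gt[4], gt[5].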
Submitted as part of a review for JOSS.
The readme includes two examples, but provides neither the bathymetry files required to run them nor instructions on where to obtain them.
If carrying out a tidal simulation with wetting and drying in a limited area, other areas will need a minimum depth set to prevent drying/wetting. This is pretty easy to do on a raster-by-raster basis by passing an array of min/max values, e.g.
minmax = None
would ignore this, but you could do:
minmax = [[None, -5], [None, -3], [None, None]]
for three rasters where depth is -ve, giving a minimum depth of 5, 3 and nothing set respectively. This would have to be passed to the raster interpolator, otherwise if the buffer overlaps with the change in min/max depths you would get oddities.
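A sketch of the per-raster clamping this suggests (plain Python; clamp is a hypothetical helper, not part of HRDS):

```python
def clamp(value, minmax):
    """Clamp a sampled value to (min, max); None means unbounded."""
    lo, hi = minmax
    if lo is not None and value < lo:
        value = lo
    if hi is not None and value > hi:
        value = hi
    return value

# Depths are negative, so capping the max at -5 enforces a minimum
# depth of 5, matching the example in the issue.
minmax = [[None, -5], [None, -3], [None, None]]
samples = [-2.0, -2.0, -2.0]   # raw depths sampled from three rasters
print([clamp(v, mm) for v, mm in zip(samples, minmax)])
# -> [-5, -3, -2.0]
```

Applying this inside the raster interpolator, as suggested, means the buffer blends the already-clamped values, so a jump in min/max between rasters cannot leak through the transition zone.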