
noaa-owp / gval


A high-level Python framework to evaluate the skill of geospatial datasets by comparing candidates to benchmark maps, producing agreement maps and metrics.

Home Page: https://noaa-owp.github.io/gval/

License: Other

Languages: Jupyter Notebook 81.94%, Python 17.97%, Dockerfile 0.09%
Topics: earth-science, environment, evaluation-framework, flood-inundation, gdal, geospatial, hydrology, python, research, science

gval's People

Contributors

fernando-aristizabal, gregorypetrochenkov-noaa, nickchadwick-noaa

Stargazers: 22

Watchers: 3

gval's Issues

Create categorical contingency table data structure

Implement a standard schema for contingency table structure.

This issue depends on #38.

Current behavior

  • Some post processing is implemented to index cross-tabulated counts by unique values in the candidate map (rows) and the benchmark map (columns).
  • There is currently no way to keep track of candidate or benchmark attributes or processing parameters.

Expected behavior

  • The above structure is great for readability but does not account for the various hierarchies of samples including sub-samples, band/features, maps, and catalogs.

  • A contingency table structure is required that reports cross-tabulated counts for each sample as well as their associated attributes.

  • Methods to aggregate/groupby crosstab counts by hierarchy level or associated attributes are important.

  • Ways of melting and pivoting this structure into something more human readable, like a cross-tabulation table indexed by unique values in the candidate map (rows) and benchmark map (columns), would be nice to have as well (see the sketch below).

  • Some method of tagging each column with metadata is necessary, e.g., the metadata source (candidate, benchmark, or process) and the hierarchy level (catalog, map, band/feature, sub-sample).

  • What data structures are necessary to account for this?

  • What classes should be created for this?

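A minimal sketch of one possible long-format layout, assuming pandas; the column names and hierarchy levels below are illustrative placeholders, not a final schema:

    import pandas as pd

    # Long-format contingency table: one row per candidate/benchmark value pair,
    # with hierarchy identifiers (catalog, map_id, band) carried as columns.
    crosstab_df = pd.DataFrame(
        {
            "catalog": ["cat_a"] * 4,
            "map_id": ["map_1"] * 4,
            "band": [1, 1, 1, 1],
            "candidate_values": [0, 0, 1, 1],
            "benchmark_values": [0, 1, 0, 1],
            "counts": [120, 30, 11, 13],
        }
    )

    # Aggregate counts at a chosen hierarchy level.
    counts_by_band = crosstab_df.groupby(["catalog", "map_id", "band"])["counts"].sum()

    # Pivot back to the human-readable cross-tabulation:
    # candidate values as rows, benchmark values as columns.
    readable = crosstab_df.pivot_table(
        index="candidate_values", columns="benchmark_values", values="counts", aggfunc="sum"
    )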

Update Test coverage to 95% or greater

Raise automated test coverage of the codebase to at least 95%.

Current behavior

Coverage is currently only around 60%.

Expected behavior

Coverage of at least 95% of the codebase.

Constrain netcdf args in `xr.open_rasterio()`

NetCDF args in open_rasterio.

Current behavior

  • NetCDF-related open arguments are currently exposed to the user even though they are not supported.

Expected behavior

  • We are not currently supporting the NetCDF-related open args: variable, group, decode_times, and decode_timedeltas.
  • We should avoid exposing these arguments to the user (see the wrapper sketch below).

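One way to avoid exposing the unsupported arguments is a thin wrapper that rejects them before delegating to rioxarray; the wrapper below is purely illustrative and not part of gval:

    import rioxarray as rxr

    # NetCDF-related keyword arguments that are not currently supported.
    _UNSUPPORTED_NETCDF_ARGS = {"variable", "group", "decode_times", "decode_timedeltas"}

    def _open_raster(path, **kwargs):
        """Hypothetical wrapper around rioxarray.open_rasterio that blocks unsupported NetCDF args."""
        blocked = _UNSUPPORTED_NETCDF_ARGS & kwargs.keys()
        if blocked:
            raise ValueError(f"NetCDF-related arguments are not supported: {sorted(blocked)}")
        return rxr.open_rasterio(path, **kwargs)
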
Dealing with element-wise comparisons that have NDV and a value pairing.

Dealing with element-wise comparisons that have NDV and a value pairing.

Current behavior

  • Element-wise pairing with value-value pairs yields a value.
  • Element-wise pairing with NDV in either candidate or benchmark yields NDV.
  • This functionality is done in gval.compare.compute_agreement_xarray() along with the comparison_function arg.

Expected behavior

  • Users may want to denote situations where the candidate or benchmark has a value and the other is NDV.
  • This represents a "value misalignment" that users may want represented in the agreement map (see the sketch below).

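A simplified, equality-based sketch of an element-wise comparison that flags pairs where exactly one side is the no-data value; the sentinel values and function name are illustrative and not gval's implementation:

    import numpy as np

    AGREEMENT_NDV = -9999      # hypothetical NDV for the agreement map
    MISALIGNMENT_CODE = -1     # hypothetical code for value/NDV mismatches

    def compare_with_misalignment(candidate, benchmark, candidate_ndv, benchmark_ndv):
        """Element-wise comparison returning a misalignment code when exactly one input is no-data."""
        # The isnan checks cover the case where the NDV itself is np.nan.
        candidate_is_ndv = (candidate == candidate_ndv) | np.isnan(candidate)
        benchmark_is_ndv = (benchmark == benchmark_ndv) | np.isnan(benchmark)

        agreement = np.where(candidate == benchmark, 1, 0)  # simple value-value comparison
        agreement = np.where(candidate_is_ndv ^ benchmark_is_ndv, MISALIGNMENT_CODE, agreement)
        agreement = np.where(candidate_is_ndv & benchmark_is_ndv, AGREEMENT_NDV, agreement)
        return agreement
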
Reading test data from S3.

Reading test data from S3.

Related to #12 and #14.

Current behavior

  • Test data is read from ./data

Expected behavior

  • Read test data from S3.

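Reading directly from S3 could look roughly like the following, assuming the object is a GeoTIFF and AWS credentials are available to GDAL/rasterio (e.g., via environment variables); the bucket and key are placeholders:

    import rioxarray as rxr

    # Placeholder S3 URL; rasterio/GDAL handles the s3:// scheme given valid credentials.
    candidate = rxr.open_rasterio("s3://example-bucket/path/to/candidate.tif", masked=True)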

Automate Test Data Creation

Automate Test Data Creation

Current behavior

  • All test data is being generated manually
  • All test data is being uploaded to S3 manually
  • All test data is being loaded into a root directory in a flat structure

Expected behavior

  • Test data will be generated from base cases or directly from parameterized tests
  • Test data will automatically be uploaded to s3
  • Data will be organized in directories based on function
  • Metadata will be listed for each dataset

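Generating small rasters directly inside parameterized tests could be done with a fixture along these lines (a sketch; the shapes, values, and fixture name are illustrative):

    import numpy as np
    import pytest
    import xarray as xr

    @pytest.fixture
    def candidate_benchmark_pair():
        """Hypothetical fixture that builds a small candidate/benchmark raster pair in memory."""
        data = np.array([[0, 1, 2], [2, 1, 0], [0, 0, 1]], dtype=np.float64)
        coords = {"y": [2.0, 1.0, 0.0], "x": [0.0, 1.0, 2.0]}
        candidate = xr.DataArray(data, coords=coords, dims=("y", "x"))
        benchmark = xr.DataArray(np.flipud(data).copy(), coords=coords, dims=("y", "x"))
        return candidate, benchmark
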
DF Schemas to handle multiple possible dtypes for a given column.

Current behavior

  • Certain columns defined in the schemas, such as band, tp, counts, candidate_values, etc., could take more than one possible dtype, including str, int, or float.
  • These are all currently set to str or float, with only one dtype allowed per column.

Expected behavior

  • Allow multiple dtypes for these columns.
  • This is likely done by subclassing pandera.dtypes and creating custom datatypes for pandera.

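Another possible approach, if custom pandera datatypes prove heavier than needed, is to leave the column dtype unconstrained and validate the allowed Python types with a Check (illustrative only; the schema and column below are placeholders):

    import pandera as pa

    # Leave dtype unset and validate that every element is a str, int, or float instead.
    crosstab_schema = pa.DataFrameSchema(
        {
            "band": pa.Column(
                checks=pa.Check(
                    lambda s: s.map(lambda v: isinstance(v, (str, int, float))).all(),
                    error="band must contain only str, int, or float values",
                )
            ),
        }
    )
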
Installation Instructions on Readme.md

Make the installation instructions in readme.md accurate.

Current behavior

  • Current installation instructions in the readme rely on PyPI- and Docker Hub-based services.

Expected behavior

  • Since we are currently not supporting PyPI or Docker Hub, we should either:
  1. Support these, or
  2. Remove them and move the contents under the "packaging" section to installation, which would enable users to install the package for end-user purposes. We can add git clone to these instructions.


Extend contingency table to accept band/variable names/numbers from both candidate and benchmarks.

Extend the bands/variables supported in the contingency table.

Current behavior

  • Only a single band number or variable name is taken from either the candidate or benchmark map.
  • Candidates or benchmarks could have different bands with different numbers or names.

Expected behavior

  • Allow for the creation of two columns in the contingency table that denote the band name/value for both the candidate and benchmark maps.

`_is_not_natural_number()` changes

Current behavior

  • _is_not_natural_number() returns -2 or raises a ValueError.
  • The value -2 has no actual purpose other than being forced so that a None type is not returned from Numba.

Expected behavior

  • Return a bool or raise a custom exception based on a raise_exception: bool argument (see the sketch below).
  • Make sure there are test cases for both the exception and bool cases.

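A rough sketch of the proposed signature; the definition of "natural number" (non-negative whole number) and the exception type are assumptions, and Numba compatibility of the raise path would still need to be verified:

    import math

    def _is_not_natural_number(value: float, raise_exception: bool = False) -> bool:
        """Return True if value is not a natural number; optionally raise instead."""
        not_natural = math.isnan(value) or value < 0 or value != int(value)
        if not_natural and raise_exception:
            raise ValueError(f"{value} is not a natural number")
        return not_natural
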
Create predicates for DataArrays and Datasets

Current behavior

  • There are several informal predicates within gval/compare.py that test for matching array sizes and dimensionality.

Expected behavior

  • These checks should all be consolidated into some sort of module(s) or class(es).
  • If relevant, predicates can have a flag that will raise an exception, whether built-in or custom, if necessary.
  • This is expected to reduce code duplication and better organize predicate behavior in general.
  • Examples of predicates with associated, optional custom exceptions that could be implemented are _is_2d_DataArray, _is_3d_DataArray, _is_DataArray, _is_Dataset, _has_rasterio_accessor, and _DataArrays_of_same_shape.

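A minimal sketch of one such predicate with an optional raise flag (the name follows the examples above; the exception type is an assumption):

    import xarray as xr

    def _is_2d_DataArray(obj, raise_exception: bool = False) -> bool:
        """Return True if obj is a 2D xr.DataArray; optionally raise instead of returning False."""
        result = isinstance(obj, xr.DataArray) and obj.ndim == 2
        if not result and raise_exception:
            raise TypeError("Expected a 2D xarray.DataArray")
        return result
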
New Value Introduced with Reproject Match

Current behavior

  • A new value is being introduced when reprojecting candidate_map_1 to benchmark_map_1.

Steps to Reproduce

  1. candidate_map_1.rio.reproject_match(benchmark_map_1)

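A quick diagnostic is to compare the unique values before and after reprojection; the new value is likely the fill/nodata value applied by reproject_match to areas outside the original extent. Here candidate_map_1 and benchmark_map_1 are the rasters from the step above:

    import numpy as np

    before = set(np.unique(candidate_map_1.values).tolist())
    reprojected = candidate_map_1.rio.reproject_match(benchmark_map_1)
    after = set(np.unique(reprojected.values).tolist())
    print(after - before)  # the newly introduced value(s)
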
Map Counts of Agreement Encodings to Statistical Metrics

As of now there is no mapping of permuted agreement encodings to categorical statistics.

Current behavior

Functionality not available yet

Expected behavior

For all available parameters in the metrics, e.g. (tp, tn, fp, fn), generate permutations over all available arguments.

Example of doing so:

    import itertools

    import numpy as np

    from gval.statistics.categorical_statistics import CategoricalStatistics

    # Example higher abstraction calling all metric functions
    cat_stat = CategoricalStatistics
    params = cat_stat.get_all_parameters()
    len_params = len(params)

    # Hypothetical counts
    counts = [120, 30, 11, 13, 20]

    # Permute through all combinations of tp, tn, fp, and fn given counts.
    # Given 4 params and 5 choices there will be 120 different combinations.
    tiled_counts = np.tile(counts, (len_params, 1))
    arg_dicts = [dict(zip(params, combo))
                 for combo in itertools.product(*tiled_counts)
                 if len(np.unique(combo)) == len_params]

Design Jupyter Notebook for second tier user functionality

There is no public-facing API for user functionality. While this issue does not encompass the creation of a gval accessor, it does cover the creation of a Jupyter notebook aimed to be released for preliminary user testing and feedback.

Current behavior

  • Key components of the pipeline are fragmented private functions.
  • The pipeline is not complete, as metric tables, agreement maps, and other outputs are not fully defined.

Expected behavior

  • We need to create a Jupyter Notebook that shows how user interaction is completed with a second tier of public functions.
  • The public API should be an Xarray accessor called gval.
  • Current examples include (see the sketch after this list):
    • rxr.open_rasterio() (for candidate and benchmark)
    • candidate.gval.spatial_alignment(benchmark, ...)
    • candidate.gval.compute_agreement_map(benchmark)
    • crosstab_table = candidate.gval.crosstab(benchmark)
    • crosstab_table.gval.compute_metrics(metrics, etc) (needs a pandas accessor)
    • any relevant plotting
    • lastly, how to serialize this information
  • Try to include plenty of comments to make user aware of what is happening.
  • Have this notebook ready for in-house user feedback no later than (NLT) April 15.

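A rough outline of what the notebook could walk through, using the accessor-style calls listed above. These calls describe the intended public API from this issue rather than functions that exist yet; the assumptions that spatial_alignment returns aligned copies and compute_metrics returns a DataFrame are illustrative:

    import rioxarray as rxr

    # Load candidate and benchmark maps (placeholder paths).
    candidate = rxr.open_rasterio("candidate.tif", masked=True)
    benchmark = rxr.open_rasterio("benchmark.tif", masked=True)

    # Proposed gval xarray accessor calls (not yet implemented).
    candidate, benchmark = candidate.gval.spatial_alignment(benchmark)
    agreement = candidate.gval.compute_agreement_map(benchmark)
    crosstab_table = candidate.gval.crosstab(benchmark)

    # Proposed pandas accessor for metrics (not yet implemented).
    metrics_table = crosstab_table.gval.compute_metrics(metrics="all")

    # Serialize outputs.
    agreement.rio.to_raster("agreement.tif")
    metrics_table.to_csv("metrics.csv")
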
Investigate the feasibility of developing a GVAL accessor for xarray.

Investigate how a GVAL accessor for Xarray would work.

  • Investigate xarray's documentation on extending xarray with accessors.
  • How would accessor methods work with respect to their return types? Under what circumstances would we lose the gval accessor?
  • What methods and attributes would we want the accessor to have?
  • Should we let the xarray developers know, per their suggestion ("To help users keep things straight, please let us know if you plan to write a new accessor for an open source library.")?
  • Investigate extended projects that have accessors already to see how they are written.
  • Investigate tools that make documenting accessors easier.

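For reference, xarray's documented extension mechanism registers an accessor class under a namespace; a minimal skeleton of what a gval accessor could look like (the method shown is a placeholder):

    import xarray as xr

    @xr.register_dataarray_accessor("gval")
    class GVALDataArrayAccessor:
        """Skeleton gval accessor; the method below is a placeholder."""

        def __init__(self, xarray_obj):
            self._obj = xarray_obj

        def compute_agreement_map(self, benchmark):
            raise NotImplementedError

Note that registration is global, but any state stored on an accessor instance is tied to the object it was created from, so objects returned by xarray operations start with a fresh accessor instance.
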
Szudzik and Cantor Pairing Function Warnings with np.nan

Current behavior

  • Szudzik and Cantor pairing functions raise RuntimeWarnings when given np.nan.

Expected behavior

  • These functions should not raise warnings for np.nan inputs.

Steps to replicate behavior (include URLs)

>>> from gval.comparison import pairing_functions
>>> import numpy as np
>>> pairing_functions.szudzik_pair_signed(1,np.nan)
<stdin>:1: RuntimeWarning: invalid value encountered in szudzik_pair_signed
nan

Compare functions should be parametrized with strings

Assigning comparison functions with strings.

Current behavior

  • gval.compare.compute_agreement_xarray() currently accepts the comparison_function argument as a callable, or as a string only for the "pairing_dict" case.

Expected behavior

  • This argument should also accept strings for the cantor and szudzik cases.
  • This would involve creating a registry of pairing functions (see the sketch below).

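A registry could be as simple as a module-level mapping from string identifiers to the existing pairing callables. Only szudzik_pair_signed is confirmed elsewhere in this document; the cantor function name below is assumed:

    from gval.comparison import pairing_functions

    # Hypothetical registry; the cantor function name is an assumption.
    PAIRING_FUNCTION_REGISTRY = {
        "szudzik": pairing_functions.szudzik_pair_signed,
        "cantor": pairing_functions.cantor_pair_signed,
    }

    def resolve_comparison_function(comparison_function):
        """Accept either a callable or a registered string identifier."""
        if callable(comparison_function):
            return comparison_function
        return PAIRING_FUNCTION_REGISTRY[comparison_function]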

Docker Compose

Consider using docker compose for stability and convenience.

Current behavior

  • Mount points for docker run -v <src>:<trgt> have to be defined by the user.

Expected behavior

- Consider using a `docker compose` yaml file to provide easy runtime instructions, to avoid having users/developers do this with every `docker run` invocation, and to enforce target mount points inside containers.
- See [this](https://docs.docker.com/get-started/08_using_compose/) for more information on docker compose.
- Account for any changes to bind points that may arise from #12. 

Decide how to encode NDV in agreement map

Decide how to encode the NDV in the agreement map given candidate and benchmark NDVs that may or may not differ.

Related to #22 because it deals with np.nans as NDV.

Current behavior

  • No NDV management is currently in place for the output of gval.compare.compute_agreement_xarray().
  • How does one use rxr.rio.encode_nodata() to write the NDV to the output xarray or file?

Expected behavior

  • Decide on a method, whether a user parameter or use of the pairing functionality (see the sketch below).

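Whatever value is chosen, rioxarray can record it on the output; a minimal sketch, assuming agreement_map is the DataArray returned by the agreement computation and the NDV shown is a placeholder:

    # Record the chosen no-data value on the agreement map and carry it into the written file.
    agreement_ndv = -9999  # placeholder choice
    agreement_map = agreement_map.rio.write_nodata(agreement_ndv)
    agreement_map.rio.to_raster("agreement_map.tif")
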
AWS Config Keys

Consider redoing AWS config behavior.

Current behavior

  • The _set_aws_environment_variables function within tests/utils.py is set to run prior to every test script, reading keys from a CSV.
  • Keys are stored within the repo in the data dir, which is kept out of version control by .gitignore.

Expected behavior

  • Consider using an officially formatted AWS config file.
  • Consider requiring users to put this file outside of the repository to avoid accidental commits.
    • Developers can access the file by using a separate volume (-v) defined at docker run time.
    • Consider using the AWS_CONFIG_FILE environment variable, set within the Dockerfile, to point to the volume mount location.
  • The above procedure could avoid security issues and deprecate the _set_aws_environment_variables function.

Revisit `rasters_intersect()`

Current behavior

  • rasters_intersect() currently checks for intersection without using shapely geometry, which may be causing problems.
  • Additionally, it assumes the rasters have the same projection.

Expected behavior

  • Verify whether this is causing problems, then patch.
  • It might be necessary to combine transform_bounds and rasters_intersect to avoid issues with different projections (see the sketch below).

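One way to make the check projection-aware is to transform one raster's bounds into the other's CRS and test intersection with shapely boxes; this is a sketch, not gval's current implementation:

    from rasterio.warp import transform_bounds
    from shapely.geometry import box

    def rasters_intersect_sketch(candidate, benchmark) -> bool:
        """Check bounds intersection after transforming benchmark bounds to the candidate CRS."""
        benchmark_bounds = transform_bounds(
            benchmark.rio.crs, candidate.rio.crs, *benchmark.rio.bounds()
        )
        return box(*candidate.rio.bounds()).intersects(box(*benchmark_bounds))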

Make Repository Public

Make repo public.

Current behavior

Currently our repository is private, only being accessible to internal users at OWP.

Expected behavior

The repository will be public, and GitHub Actions/GitHub Pages will be available.

Default masking and scaling behavior.

Current behavior

  • Default behavior in xr.open_rasterio() does not mask and scale data.

Expected behavior

  • Let's constrain opening to keep mask_and_scale=True.

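For example, opening could always be done with masking and scaling enabled; rioxarray exposes this via the mask_and_scale argument, which converts nodata to np.nan and applies any scale/offset:

    import rioxarray as rxr

    # Placeholder path; nodata values become np.nan and scale/offset are applied.
    candidate = rxr.open_rasterio("candidate.tif", mask_and_scale=True)
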
Dask array test coverage

Coverage of Dask arrays.

Current behavior

  • There is currently no testing for Dask arrays in place.

Expected behavior

  • Dask arrays should be covered in tests.
  • Dask arrays are employed through the chunks argument of rxr.open_rasterio()
  • Related arguments within rxr.open_rasterio() that could affect test status include cache and lock.
  • If we believe that these arguments could affect test status for functionality within gval, they should be covered within tests.

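Coverage could be added by parametrizing existing tests over the chunks argument (and later cache and lock); the test below is a sketch with a placeholder path and assertion:

    import pytest
    import rioxarray as rxr

    @pytest.mark.parametrize("chunks", [None, 256, {"x": 256, "y": 256}])
    def test_open_with_chunks(chunks):
        """Exercise dask-backed arrays through the chunks argument of open_rasterio."""
        candidate = rxr.open_rasterio("candidate.tif", chunks=chunks)  # placeholder path
        assert candidate.shape  # replace with real assertions on gval functionality
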
Multi-band support within `crosstab_xarray()`

Current behavior

  • crosstab_xarray() does not currently allow for multiple bands either as 3D arrays or as xr.Dataset.

Expected behavior

  • Implement the ability to allow for an xr.DataArray with multiple bands or an xr.Dataset with multiple variables.
  • Only support 3D datasets, so a check must be put in place that enforces this for both DataArrays and Datasets (see the sketch below).
  • Some ability to preserve bands or not in the computation of cross tabs must be included.

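A sketch of the kind of dimensionality check described above (simplified; per-band crosstabs could then be computed by selecting each band in turn):

    import xarray as xr

    def _check_3d(obj):
        """Require a 3D DataArray or a Dataset whose variables are each 3D."""
        if isinstance(obj, xr.DataArray):
            if obj.ndim != 3:
                raise ValueError("Expected a 3D DataArray (band, y, x)")
        elif isinstance(obj, xr.Dataset):
            if any(var.ndim != 3 for var in obj.data_vars.values()):
                raise ValueError("Expected all Dataset variables to be 3D (band, y, x)")
        else:
            raise TypeError("Expected an xr.DataArray or xr.Dataset")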

Implement library of categorical statistics.

Generate a sub-package, module, or class of basic level categorical statistics.

This issue is related to but independent of #17.

Current behavior

  • No functions to compute categorical statistics exist yet.

Expected behavior

  • We need to have a sub-package, module, or class of basic level categorical statistics.
  • These functions should be agnostic to two-class or multi-class use, accepting only TP, TN, FP, and FN=None as input.
  • Some basic examples include MCC, CSI, TPR, F1-Score, Precision, Recall, Accuracy, FAR, etc.
  • How do we account for functions that are mathematically equivalent but go by differing names?
  • How do we account for cases where users can pass strings to denote which statistics to compute?
  • Include references for every statistic.
  • Further functionality should be later generated to apply these functions to the contingency table structures proposed in #17.

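For illustration, a couple of these statistics written as plain functions of the contingency counts; the formulas are standard, and the uniform signature with fn defaulting to None mirrors the bullet above:

    import math

    def critical_success_index(tp, tn, fp, fn=None):
        """CSI = TP / (TP + FP + FN); tn is unused but kept for a uniform signature."""
        fn = 0.0 if fn is None else fn
        return tp / (tp + fp + fn)

    def matthews_correlation_coefficient(tp, tn, fp, fn=None):
        """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
        fn = 0.0 if fn is None else fn
        denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return (tp * tn - fp * fn) / denominator
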
Implement Pipeline on GitActions

Develop a build process to test the build, run unit tests, lint, and assess coverage.

Current behavior

No such pipeline exists

Expected behavior

Develop pipeline and have badges for successful build, passing unit tests, and unit test coverage.

Apply categorical metrics to contingency tables.

The categorical metrics developed in #18 need to be applied to the contingency table structure developed in #17.

Current behavior

  • There is currently no way to apply the metrics to, or compute them on, contingency tables.

Expected behavior

  • How do we apply the functions developed in #18 to compute agreement metrics at the varying sample hierarchies while preserving hierarchy identifiers and their attributes?
  • How do we account for two-class vs multi-class cases?
  • How do we account for one-vs-one and one-vs-all methods of computing metrics for multi-class cases?

Redo testing functionality

Current Behavior

  • Current testing functionality is very limited.
  • It does not use pytest very well.
  • It does not have a good structure.
  • It does not begin to test all of the variations of test cases that could be presented to the implemented functionality.

Proposed Work

  1. Do in-depth research for PyTest functionality.
  2. Restructure tests to better mimic package structure.
  3. Make more test data.
    • Features and nuances to test:
      • xr.open_rasterio() args:
        • xr.Datasets with singular and multiple variables
        • xr.DataArrays with multiple bands or Variables with multiple bands.
        • xarray with masking and/or mask_and_scale (np.nan).
        • Accounting for sub-objects of xarrays in testing, including coordinates, attrs, encodings, etc.
          • Testing with xr.testing.assert_equal, xr.testing.assert_identical, or np.testing.assert_equal
        • Investigate how using dask, chunking, caching, and windowing affects fidelity of tests.
      • Parameters passed to gval.compare.crosstab_xarray() and gval.compare.compute_agreement_xarray()
        • No testing is available for these parameters.
        • No capability to account for allow or deny listing of values is in place for gval.compare.compute_agreement_xarray().
      • Benchmarking
        • General computational performance across files with varying numbers of samples, to understand performance limitations.

Decide how to manage np.nans

np.nans can take float or float64 form

Current behavior

  • When loading unmasked arrays, there are no np.nans.
  • When loading masked arrays via the open_rasterio arguments masked and mask_and_scale, np.nan takes np.float64 form.

Expected behavior

  • Need to deal with the data type discrepancy, as tests fail with masked arrays.

Investigate behavior of `xr.open_rasterio()`

Investigate behavior of xr.open_rasterio().

  • xr.open_rasterio() args:
  • xr.Datasets with singular and multiple variables
  • xr.DataArrays with multiple bands or Variables with multiple bands.
  • xarray with masking and/or mask_and_scale (np.nan).
  • Investigate how using dask, chunking, caching, and windowing affects fidelity of tests

Datatype Issues between 32 bit types and 64 bit types

When using pairing functions, RuntimeWarnings arise from overflow and invalid values in the math between datasets with int32 or float32 dtypes and those with int64 or float64 dtypes.

Current behavior

RuntimeWarnings are thrown and questionable output is produced.

Expected behavior

Align datatypes to 64-bit (highest precision) to avoid these RuntimeWarnings so none are thrown.

Location of references.bib

What is the best location for references.bib?

Current behavior

  • Current file is located in root directory.

Expected behavior

  • Is a bib file able to render with a sphinx/readthedocs project?
    • If so, can we move this to docs so that it eventually turns into a references page within the documentation website?

Add additional datasets to S3

Current behavior

  • There is only one pair of candidate and benchmark maps on S3.

Expected behavior

  • Need to account for a variety of test cases:
    • multi-band rasters
    • adding COGs and compressed datasets for performance
    • adding non-homogenized and homogenized datasets
    • varying datatypes (int, float, etc)
    • grouping datasets by use case and adding meta-data. Try to leverage current module structure.
    • generating datasets with all possible combinations of parameters.

Example candidate dataset: s3://fernandoa-bucket/foss_fim/test_cases/ble_test_cases/12100202_ble/testing_versions/20210902_C892f8075_allBle_n_6_MS/100yr/inundation_extent_12100202_MS.tif

Example benchmark dataset: s3://fernandoa-bucket/foss_fim/test_cases/ble_test_cases/validation_data_ble/12100202/100yr/ble_huc_12100202_extent_100yr.tif

Live Documentation GitPages or ReadTheDocs

Live documentation is necessary for a public release to keep interested users informed on how to use the package.

Current behavior

None exist

Expected behavior

Live documentation

Pip Install Permissions

Pip installs inside the running container encounter permission issues.

Current behavior

  • Dockerfile conducts pip installs as root.
  • Running pip installs as root in a virtual environment seems to generate permission issues.

Expected behavior

  • Users should be able to install with pip in active containers without permissions issues.

Steps to replicate behavior (include URLs)

  1. Enter interactive session of current image, docker exec -it gval bash
  2. Run $VENV/bin/pip install xarray-spatial
  3. The screenshot below shows how permissions are being managed.

Screenshots

(Screenshot dated 2023-02-17 showing how permissions are managed; image not reproduced here.)

Package everything in src directory

Packaging and unit tests will work more seamlessly with the code in a src directory. The local package is not accessible as-is, pending updates to the Jupyter notebooks.

Current behavior

Local installs are not always successful and are not accessible from notebooks

Expected behavior

Local installs will be successful and accessible from notebooks

Allow listing within `gval.compare.compute_agreement_xarray()`

Allow & deny listing within gval.compare.compute_agreement_xarray()

Current behavior

  • No functionality exists to permit allow listing within gval.compare.compute_agreement_xarray().
  • Some functionality for this is included in gval.compare.crosstab_xarray().

Expected behavior

  • Expose the ability to do allow listing within gval.compare.compute_agreement_xarray(), similar to what's available in gval.compare.crosstab_xarray() (see the sketch below).

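Allow listing could be implemented by masking out any values that are not in a user-supplied list before the agreement computation; the helper name, argument names, and fill value below are illustrative:

    import numpy as np
    import xarray as xr

    def apply_allow_list(raster: xr.DataArray, allow_list, fill_value=np.nan) -> xr.DataArray:
        """Keep only values present in allow_list; everything else becomes fill_value."""
        return raster.where(raster.isin(list(allow_list)), fill_value)
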
Make a pairing function module and registry of pairing functions.

Current behavior

  • The pairing functions supported are currently in gval/compare.py. This module has grown too large and is not specific enough for pairing functions.
  • There is no clear way for users to access pairing functions.

Expected behavior

  • Users need a specific location to access pairing functions from.
  • Users should also be able to access pairing functions using a string identifier that accesses a registry of pairing functions.
  • Additionally, users should be able to register new pairing functions as needed.

Explore dictionary based pairing function

Explore alternative pairing functions, such as a dictionary-based pairing function.

Related to #24 and #25.

Current behavior

  • The only supported pairing functions for categorical data types are Cantor and Szudzik.

Expected behavior

  • Explore an alternative pairing function based on a dictionary of pairings and their expected outputs, such as:
{
    (c0, b0) : a0, 
    (c0, b1) : a1, 
    (c1, b0) : a2, 
    (c1, b1) : a3, 
}
  • where cX represents a unique value in the candidate map, bX a unique value in the benchmark map, and aX a unique value in the agreement map.
  • This should be a numba vectorized/jit function for performance.

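A plain-Python sketch of the dictionary-based pairing; a production version would need a Numba-compatible structure (e.g. numba.typed.Dict or an array lookup) to be vectorized/jit-compiled as noted above, and the dictionary contents are placeholders:

    import numpy as np

    # Hypothetical pairing dictionary: (candidate value, benchmark value) -> agreement value.
    pairing_dict = {
        (0, 0): 0,
        (0, 1): 1,
        (1, 0): 2,
        (1, 1): 3,
    }

    def pairing_dict_fn(candidate_value, benchmark_value):
        """Look up the agreement encoding for a single candidate/benchmark pair."""
        return pairing_dict[(candidate_value, benchmark_value)]

    # Element-wise application over arrays (slow; for illustration only).
    apply_pairing = np.vectorize(pairing_dict_fn)
    agreement = apply_pairing(np.array([0, 1, 1]), np.array([0, 0, 1]))  # -> [0, 2, 3]
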
Change data source to remote S3

Change test data source to S3 and no longer pull down files locally.

Current behavior

Pulls down files from S3 if not present locally

Expected behavior

Read directly from S3
