culvert-toolkit's Introduction

Culvert Analysis Toolkit

NOTE: this repository is work-in-progress

The Culvert Analysis Toolkit is a collection of GIS tools designed to support analysis of culvert performance within their watersheds. It uses a TR-55 model with parameters derived from geospatial data to calculate peak flow and culvert capacity.

This software encompasses work originally developed by the Cornell Soil & Water Lab (on GitHub @ https://github.com/SoilWaterLab)

Installation and Use

Check out the documentation at https://civicmapper.github.io/culvert-toolkit.

Credits/Contributors

This repository represents an evolution of a previous effort, the Peak-Flow Calculator, which was a hard fork of the Water Lab's CulvertEvaluation repository.

Made in collaboration with the Cornell S&W lab, this repository includes additional updates to the science of the analysis from CulvertModel_2.1, as well as a ground-up rebuild of the supporting codebase.

  • Original peak flow analytics are based on the culvert evaluation model developed by Rebecca Marjerison at the Cornell Soil and Water Lab in 2013, as CulvertEvaluation
  • David Gold, Python script development, August 4, 2015
  • Object-oriented structure and resiliency updates built by Noah Warnke, August 31, 2016 (no formulas changed)
  • Updated by Zoya Kaufmann June 2016 - August 2017
  • Merged with older versions by Tanvi Naidu June 19 2017
  • Fork, refactor, and creation of CLI and ArcMap interfaces by Christian Gass @ CivicMapper, Fall 2017, as peak-flow-calculator
  • Updates for use within ArcGIS Pro by Christian Gass @ CivicMapper, Spring/Summer 2019, in peak-flow-calculator
  • CulvertModel_2.1 by Jo Archibald in 2019
  • Drain-It / Culvert Analysis Toolkit repository by CivicMapper, 2021-present


culvert-toolkit's Issues

flagging suspect peak flow results

There are a handful of types of results we should consider automatically flagging for QC purposes:

  • small watersheds (<1 acre)
  • time of concentration <0.1hr (6 min)
  • very large watersheds (>25 square miles)

(continue to catalog in this issue)
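
A flagging pass could be a simple pure function over the delineation outputs. A minimal sketch, assuming area and time of concentration are available in square kilometers and hours (the function and field names are hypothetical; only the thresholds come from this issue):

def qc_flags(area_sqkm, tc_hr):
    """Return a list of QC flags for one watershed's peak-flow result."""
    ACRES_PER_SQKM = 247.105
    SQMI_PER_SQKM = 1 / 2.58999
    flags = []
    if area_sqkm * ACRES_PER_SQKM < 1:
        flags.append("small watershed (<1 acre)")
    if tc_hr < 0.1:
        flags.append("time of concentration <0.1 hr (6 min)")
    if area_sqkm * SQMI_PER_SQKM > 25:
        flags.append("very large watershed (>25 square miles)")
    return flags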

Handle missing or no rainfall rasters

A user encountered an issue where the rainfall download tool ran successfully but generated an empty JSON file. The culvert calculator also ran, but without rainfall rasters to reference, all peak flow estimates were empty. The telltale sign (other than the empty output) was that the processing time for calculating average rainfall per catchment per storm frequency was 0. None of that should have happened.

This should be handled:

  • as an input validation step in the Python toolbox (#14)
  • in the tool itself: we shouldn't run delineations if there aren't any rainfall rasters to reference, and we shouldn't kick things off at all if the JSON file is empty (see the validation sketch below)
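
A fail-fast sketch along those lines, assuming the config is the JSON file produced by the rainfall download tool and that it references raster paths (the "rasters" key and function name are assumptions):

import json
from pathlib import Path

def validate_rainfall_config(config_path):
    """Fail fast if the rainfall config is empty or references no rasters."""
    p = Path(config_path)
    if not p.exists() or p.stat().st_size == 0:
        raise ValueError(f"rainfall config {p} is missing or empty")
    config = json.loads(p.read_text())
    rasters = config.get("rasters", [])  # hypothetical key
    if not rasters:
        raise ValueError("config references no rainfall rasters")
    missing = [r for r in rasters if not Path(r).exists()]
    if missing:
        raise ValueError(f"rainfall rasters not found: {missing}")
    return config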

Accounting for upstream culvert capacity in the peak flow at downstream culverts

Currently upstream culvert capacity is not accounted for in the calculation of peak flow at downstream culverts. This is because we run a delineation for every culvert individually.

To account for the modulation of incoming flow by upstream culverts, we'd need:

  • to delineate all watersheds in the AOI at once and get a layer of non-overlapping boundaries
  • to determine the network relationship between a culvert and any upstream or downstream culverts

In the original https://github.com/civicmapper/peak-flow-calculator/ toolbox, we ran watershed delineation on all locations at once, with the results being non-overlapping watersheds.

Creating a network may just be a matter of having those watershed boundaries (as a grid), a flow accumulation grid, and some TBD map algebra.
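
As a starting point for that map algebra, here's a hedged numpy sketch: given a grid of non-overlapping watershed labels (one per culvert) and a D8 flow-direction grid, any cell whose flow crosses a label boundary defines an upstream-to-downstream relationship. The function name and the ESRI D8 encoding (integer codes, integer label grid) are assumptions:

import numpy as np

# ESRI D8 direction codes -> (row, col) offsets
D8 = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
      16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

def downstream_pairs(labels, flowdir):
    """Yield (upstream_id, downstream_id) wherever flow crosses a watershed boundary."""
    pairs = set()
    rows, cols = labels.shape
    for r in range(rows):
        for c in range(cols):
            here = labels[r, c]
            if here <= 0 or flowdir[r, c] not in D8:
                continue
            dr, dc = D8[flowdir[r, c]]
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                there = labels[nr, nc]
                if there > 0 and there != here:
                    pairs.add((int(here), int(there)))
    return pairs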

Peak Flow calculator math errors/warnings

Somewhat sporadically, we're seeing warnings (not exceptions) pop up from the Peak Flow calculator:

drainit\calculators\runoff.py:166: RuntimeWarning: divide by zero encountered in log10
  qu = 10 ** (CONST_0 + CONST_1 * numpy.log10(tc_hr) + CONST_2 *  (numpy.log10(tc_hr))**2 - 2.366)
drainit\calculators\runoff.py:166: RuntimeWarning: invalid value encountered in double_scalars
  qu = 10 ** (CONST_0 + CONST_1 * numpy.log10(tc_hr) + CONST_2 *  (numpy.log10(tc_hr))**2 - 2.366)
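
numpy.log10(0) emits exactly this divide-by-zero warning and returns -inf, which then propagates a nan through the expression (the "invalid value" warning). So the likely trigger is a time of concentration of zero. Since the TR-55 graphical peak discharge method is only defined for tc of at least 0.1 hr (the same floor as in the flagging issue above), one hedged fix is to clamp before taking the log. A sketch, with the line from runoff.py wrapped in a function so the constants are explicit:

import numpy

def graphical_peak_discharge_factor(tc_hr, const_0, const_1, const_2):
    """qu from runoff.py, with tc clamped to the TR-55 0.1 hr floor."""
    tc_hr = max(tc_hr, 0.1)  # numpy.log10(0) is -inf and poisons the result
    log_tc = numpy.log10(tc_hr)
    return 10 ** (const_0 + const_1 * log_tc + const_2 * log_tc ** 2 - 2.366)

Whether clamping is the right behavior (versus flagging and skipping the record) is a modeling decision; either way the zero tc should surface somewhere visible rather than as a warning.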

Modeling upstream max storage volume

(Some notes)

  • Knowing the elevation of the crown of the road at the crossing, project that elevation upstream and create the lake/bathtub/reservoir--the area that would fill before it would spill over the roadway.
  • Figure out the volume of that area--the upstream max storage volume.
  • Compare that volume to capacity, and determine if that volume would translate to a rate that would exceed the capacity (which is also a rate)

More to come.
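
For the volume step, a minimal bathtub-fill sketch over a DEM clipped to the upstream contributing area. Everything here is an assumption, and note that this ignores hydraulic connectivity (pockets below the crown elevation that wouldn't actually fill from the crossing):

import numpy as np

def max_storage_volume(dem, crown_elev, cell_area):
    """Storage volume below crown_elev over a DEM (consistent length units)."""
    depth = crown_elev - dem  # positive where the surface is below the crown
    depth = np.where(np.isnan(dem), 0.0, np.clip(depth, 0.0, None))
    return float(depth.sum() * cell_area)

Comparing that volume to capacity would still need a routing assumption to turn stored volume into a rate.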

New delineation-only workflow and tool

Create a new workflow and toolbox tool that does everything up to the capacity/return period analysis of the existing culvert capacity workflow.

Prepped NAACC geodata is the input; a config file and geodata (a feature class) are the outputs.

Make flow length parameter optional in culvert analysis tool

The flow length raster is an optional input in the workflow script.

Currently the corresponding Python toolbox parameter is set to required. Make it optional, and update the help text accordingly (explain that the flow length raster will be generated on the fly, which noticeably extends analysis time).
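
In the Python toolbox this is a one-word change on the parameter definition plus the help text. A sketch, with the display text and parameter name as assumptions:

import arcpy

param_flow_len = arcpy.Parameter(
    displayName=("Flow Length Raster (optional; generated on the fly "
                 "if omitted, which noticeably extends analysis time)"),
    name="flow_length_raster",
    datatype="GPRasterLayer",
    parameterType="Optional",  # was "Required"
    direction="Input",
)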

Filter out all crossing types except culvert and multi-culvert

Switch from the deny-list approach, where the code specifies which types of NAACC points aren't permitted and allows everything else through, to an allow-list approach, where we specify only the crossing types allowed through and deny everything else: Culvert and Multi-Culvert.
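
A minimal sketch of the allow-list (the NAACC field holding the crossing type is an assumption here):

ALLOWED_CROSSING_TYPES = {"Culvert", "Multi-Culvert"}

def is_allowed_crossing(record: dict) -> bool:
    """Keep only allow-listed crossing types; everything else is denied."""
    return record.get("Crossing_Type") in ALLOWED_CROSSING_TYPES

One upside of this direction: any new crossing type NAACC adds later is excluded by default until we explicitly allow it.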

scratch workspace detection issues

Problem

We encountered an error with calculator execution right off the bat like this:

Traceback (most recent call last):
  File "<string>", line 111, in execute
  File "C:\Users\User.Name\AppData\Local\ESRI\conda\envs\CornellCulvertTool\Lib\site-packages\drainit\workflows.py", line 543, in __init__
    self.save_config_json_filepath = f'{self.gp._so("drainit_config", suffix="", where="folder")}.json'
  File "C:\Users\User.Name\AppData\Local\ESRI\conda\envs\CornellCulvertTool\Lib\site-packages\drainit\services\gp\_esri\__init__.py", line 171, in _so
    location = Path(env.scratchFolder)
  File "C:\Users\User.Name\AppData\Local\ESRI\conda\envs\CornellCulvertTool\Lib\pathlib.py", line 1082, in __new__
    self = cls._from_parts(args, init=False)
  File "C:\Users\User.Name\AppData\Local\ESRI\conda\envs\CornellCulvertTool\Lib\pathlib.py", line 707, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "C:\Users\User.Name\AppData\Local\ESRI\conda\envs\CornellCulvertTool\Lib\pathlib.py", line 691, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

Failed to execute (CulvertCapacityPytTool).

The error was that arcpy.env.scratchFolder was returning None. This folder is relied on for temporary file I/O.

Cause

It looks like the global environment setting in ArcGIS Pro for the scratch workspace has to be set explicitly to a folder. That setting is here:

(screenshot: the Scratch Workspace setting in ArcGIS Pro's geoprocessing environment options)

It cannot be set using a Model Builder-style variable, e.g. %my_scratch_folder%.

That was the issue in this particular case.

Solution

For end-users

Set the scratch workspace explicitly to a folder. It's not clear whether Model Builder variables are supported as global environment settings.

In the codebase

We should fall back to the system temp directory if the configured ArcGIS Pro scratch folder/workspace isn't available for some reason. That change would happen in the scratch output path generator function:

def _so(self, prefix, suffix="unique", where="in_memory"):
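
A sketch of that fallback, pulled out of the _so context for clarity (the surrounding class is omitted; only arcpy.env.scratchFolder is real here):

import tempfile
from pathlib import Path

from arcpy import env

def scratch_folder() -> Path:
    """Configured ArcGIS scratch folder, else the system temp directory."""
    scratch = env.scratchFolder  # None when the environment isn't set to a folder
    return Path(scratch) if scratch else Path(tempfile.gettempdir())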

release strategy relative to ArcGIS Pro versions

Currently we're building releases against the latest available ArcGIS Pro Python environment, but there are no docs or other indicators of which versions of Pro a release works with.

It's largely the Python/Anaconda version that might change between ArcGIS Pro releases. Geoprocessing tools may change as well, but that's less likely with the tools we rely on.

Once we start formally publishing releases, we can include that metadata.

NAACC data schema change

The NAACC schema changed, breaking the NAACC data ingest tool. An end user worked out what changed; their summary follows:

I was able to get the ingest tool to work by making some edits to the detailed data downloaded from NAACC. They changed some of the columns and the values used (now 0/1 instead of TRUE/FALSE), and rather than allowing an empty value, it now inserts NULL.

  • Inlet_Type is now in twice. To work around this, add a column at AB, copy column BI and paste it into column AB (Inlet_Type), then at column BI add a 1 to the end of the column heading to make it Inlet_Type1. This matches how the tool was originally created.
  • The Approved (F), Maine_Private (AI), Inlet_Type (AB), and No_Crossing (AJ) columns now use 0 or 1 rather than TRUE/FALSE; change them back to TRUE/FALSE.
  • Many columns now include NULL rather than a 0 or empty value, e.g. Inlet_Type (AB), Terrestrial_Passage_Score (AS), Armoring (AW), Barrier_Severity (AY), Dry_Passage (BB), Inlet_Grade (BD), Inlet_Structure_Type (BG), Inlet_Type1 (BI, same as above), Outlet_Grade (BR), and Slope_Confidence (BZ).
  • Editing the CSV to match the expected columns, replacing 0/1 with TRUE/FALSE, and removing NULLs allowed the table to ingest.
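
Until the ingest tool itself is updated, those edits could be scripted. A hedged pandas sketch of the same normalization (filenames are placeholders, and the duplicate-column handling is simplified; pandas renames the second duplicate Inlet_Type column to Inlet_Type.1 on read):

import pandas as pd

BOOL_COLS = ["Approved", "Maine_Private", "Inlet_Type", "No_Crossing"]

df = pd.read_csv("naacc_detailed_export.csv")
df = df.rename(columns={"Inlet_Type.1": "Inlet_Type1"})  # second duplicate column
df = df.replace("NULL", pd.NA)  # literal NULL strings become empty on write
for col in BOOL_COLS:
    df[col] = df[col].map({1: "TRUE", 0: "FALSE", "1": "TRUE", "0": "FALSE"})
df.to_csv("naacc_normalized.csv", index=False)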

tqdm causes geoprocessing tools to *appear* to fail

tqdm provides progress bars at various stages in the workflow when the tools are imported as a library and run outside of ArcGIS Pro. However, the messaging framework used by geoprocessing tools inside ArcGIS Pro doesn't handle tqdm output well: it only shows the tqdm bar after the tool has finished running, and it reports a tool error. So the tool looks like it has failed when in reality it completed everything it needed to do.

Either remove (for now) or figure out how to conditionally use tqdm based on the context.
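
One hedged option for the conditional approach: a module-level switch that the PYT entry points flip before invoking workflows, so library users keep their progress bars. All names here are assumptions:

_QUIET = False

def set_quiet(quiet: bool) -> None:
    """Called by the PYT tools, where tqdm output confuses the GP messages UI."""
    global _QUIET
    _QUIET = quiet

def progress(iterable, **tqdm_kwargs):
    """Wrap an iterable in tqdm only when progress bars are safe to emit."""
    if _QUIET:
        return iterable
    from tqdm import tqdm
    return tqdm(iterable, **tqdm_kwargs)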

New crossing analytics-only workflow and tool

Meant as a way to pick up from the partial workflow described in #11: takes the workflow config file as input and runs the rest of the capacity/return period analysis. An updated config and geodata are the outputs.

Culvert snapping edge cases

Need to handle:

  • pre-existing x and y fields
  • duplicate culvert surveys that snap to the same location, which leaves missing geometries for every record after the first during the join

Workflow modification: only run delineations per crossing (point group)

Currently, in the culvert capacity workflow, the peak-flow calculation is run for every record in the NAACC table. This calculation includes costly raster operations like delineation and flow length.

In a best-practices data prep workflow, all culverts at a common crossing have been snapped to the same location (flow line) on a hydrologically-corrected DEM. That means every culvert at the same location will have the same peak flow. We could cut geoprocessing time down quite a bit for multi-culvert crossings if we ran the calculation once and then "shared" the output with the other culvert records at the same crossing.

There are at least two approaches here:

  1. Run peak flow per record in the NaaccCrossing model (which includes a list of culverts via the NaaccCulvert model).
  • For this approach to work well, it assumes that all culverts with a common Survey_Id have in fact been snapped to the same location. Are there cases where they might not be?
  2. Analyze the geographic coincidence of culverts (using, say, the resolution of the DEM as a snapping distance) and use those groups to run peak flow, ignoring any association based on the Survey_Id; see the sketch after this list.
  • This approach could also handle situations where there are multiple surveys for a crossing.

Changing this would require a pretty significant refactor of the existing workflows, but not of the underlying calculators or lower-level geoprocessing functions. That refactor may be best implemented in parallel with other tool work.
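
For approach 2, a minimal coincidence-grouping sketch using the DEM cell size as the snapping distance. Names are hypothetical, and note this groups by shared cell rather than true distance, so points in adjacent cells stay separate:

from collections import defaultdict

def group_culverts_by_cell(points, cell_size):
    """points: iterable of (culvert_id, x, y) -> lists of co-located ids."""
    groups = defaultdict(list)
    for culvert_id, x, y in points:
        key = (round(x / cell_size), round(y / cell_size))
        groups[key].append(culvert_id)
    return list(groups.values())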

field lists for NAACC snapping tool

Generate a list of fields from the input feature class and use it to auto-populate a select menu for the target and source join field parameters.
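
A sketch of how this typically looks in a PYT's updateParameters (the class name and parameter indexes are hypothetical; arcpy.ListFields and the parameter filter.list are the standard mechanisms):

import arcpy

class NaaccSnappingTool:  # hypothetical tool class
    def updateParameters(self, parameters):
        """Populate join-field pick lists from the input feature class."""
        in_fc = parameters[0]
        if in_fc.altered and in_fc.valueAsText:
            names = [f.name for f in arcpy.ListFields(in_fc.valueAsText)]
            parameters[1].filter.list = names  # target join field
            parameters[2].filter.list = names  # source join field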

Which culvert location to use for final results?

In a NAACC Capacity calculator workflow that utilizes a hydro-corrected DEM and snaps culverts to crossing locations on hydro lines, should we:

  • return the results at those locations as well?
  • Or, move the final results back to the locations in the NAACC GIS Latitude/Longitude fields?

Move crossing-level analytics to the NaaccCrossing model

NaaccCrossing is currently just a container for grouping NaaccCulvert instances, but we could offload a lot of the business logic around determining the max return period per crossing, which is currently performed in CulvertCapacity workflow methods, to the model (or an associated/coupled set of service functions).

This work would be a prerequisite for #16, because we'd be changing where iteration happens.
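
A hedged sketch of the shape this could take. The attribute names are assumptions, and whether a crossing-level result aggregates its culverts by max, min, or a capacity sum is exactly the business logic that would move here; max is shown only as a placeholder:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class NaaccCulvert:
    naacc_id: str
    max_return_period: Optional[float] = None  # per-culvert analysis result

@dataclass
class NaaccCrossing:
    crossing_id: str
    culverts: List[NaaccCulvert] = field(default_factory=list)

    def max_return_period(self) -> Optional[float]:
        """Crossing-level rollup, moved here from the CulvertCapacity workflow."""
        values = [c.max_return_period for c in self.culverts
                  if c.max_return_period is not None]
        return max(values) if values else None  # placeholder aggregation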

Respect map-based feature selection in PYT tools

Selections of feature layers used as tool inputs should follow the convention of other system tools and limit what the tool processes.

(screenshot: the NAACC Culvert Capacity tool dialog in the ArcToolbox GUI, with a feature-layer input that has an active selection)

While the input in the example above indicates that there is a selection, the tool currently ignores it. We need to look into how the parameters are being passed in.
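
One place to look: a selection only travels with the layer object, so if the tool resolves the parameter to the underlying dataset path before opening cursors, the selection is lost. A small diagnostic sketch (arcpy's Describe on a feature layer exposes selected OIDs via FIDSet):

import arcpy

def has_selection(layer) -> bool:
    """True when the input feature layer carries an active selection."""
    fid_set = getattr(arcpy.Describe(layer), "FIDSet", "")
    return bool(fid_set)  # empty string means no selection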

flag duplicate survey situations in the NAACC Data Snapping Tool output table

Add a flag indicating if the feature has been resnapped or is relying on its original location. This can be a boolean field.

Background:

Crossings are surveyed more than once, and it is possible for more than one survey record to come through in NAACC (sometimes one is better than the other, or it's a mixed bag, but that's another issue). When these are snapped to a single point on a hydro line using the available tooling, only the location and the ID of one of the original records are preserved.

So, during the re-snapping process, any original records that don't have a snapped geometry record need to be handled in some way. Currently the code falls back to the feature's original location. However, it would be useful to have a way to find these records as a final QAQC step before running the peak-flow/capacity tools.

Having a queryable field will simplify this process. We could also, through ArcPy Toolbox, apply a symbology to the result when auto-added to the map.
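
A minimal sketch of adding the flag; the field name and the output feature class path are placeholders:

import arcpy

out_fc = "naacc_snapped_points"  # placeholder: the snapping tool's output

# 1 = geometry came from a snapped record; 0 = fell back to the original NAACC location
arcpy.management.AddField(out_fc, "resnapped", "SHORT",
                          field_alias="Resnapped to hydro line")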

validate precipitation source configuration file in PYT tools

Validate the selected precipitation source configuration file in any tool in the PYT that uses it. Return parameter error+message if validation fails.

The file is validated within the workflow itself, but it'd be nice to catch any issues before the tool runs.
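
In updateMessages, the tool can attempt the same parse the workflow performs and surface failures on the parameter itself. A sketch (the parameter index and the minimal JSON check are assumptions; ideally this would call the workflow's own validator):

import json

class CulvertCapacityPytTool:  # sketch only; mirrors the tool named above
    def updateMessages(self, parameters):
        """Flag an unreadable precipitation source config before execution."""
        precip_param = parameters[2]  # hypothetical index of the config parameter
        if precip_param.valueAsText:
            try:
                with open(precip_param.valueAsText) as f:
                    json.load(f)
            except (OSError, json.JSONDecodeError) as e:
                precip_param.setErrorMessage(
                    f"Invalid precipitation source config: {e}")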

Change rainfall units in output geodata

The unit for NOAA rainfall rasters is 1000ths of an inch. Internally, during the peak flow calculation, these values are converted to centimeters on the fly. The saved "sheds" feature class still includes the original 1000ths-of-an-inch measurement, which is not the most obvious unit.

Consider saving the downloaded rasters in either inches or centimeters. This could happen in this function:

def create_geotiffs_from_rainfall_rasters(...):

That would keep changes to the rest of the code minimal.

Other changes would include any models that reference rainfall units.
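
The conversion itself is trivial raster algebra (1 inch = 2.54 cm). A sketch using arcpy's Spatial Analyst Raster objects; the paths are placeholders:

import arcpy
from arcpy.sa import Raster

arcpy.CheckOutExtension("Spatial")  # raster algebra needs Spatial Analyst

INCH_TO_CM = 2.54
raw = Raster("noaa_download.tif")           # values in 1000ths of an inch
rainfall_cm = (raw / 1000.0) * INCH_TO_CM   # convert once, at download time
rainfall_cm.save("rainfall_cm.tif")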
