mkoohafkan / rastestr

An R package for basic testing of HEC-RAS outputs

License: GNU General Public License v3.0

R 100.00%

rastestr's Issues

maintain information on Rivers and Reaches

Right now the read_* methods label columns in the format XS_###.## with cross section stationing. River and Reach information should also be included, either in the format of the column names (e.g. River/Reach/###.##) or as attributes of the data frame.
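A minimal sketch of the two options mentioned above, using base R only. The helper `make_xs_labels`, the attribute name `xs.info`, and the river/reach names are hypothetical and not part of the package.

```r
# Hypothetical helper: build "River/Reach/Station" column labels.
make_xs_labels = function(river, reach, station) {
  paste(river, reach, sprintf("%.2f", station), sep = "/")
}

# Made-up cross section metadata for illustration.
xs.info = data.frame(
  River = c("Main", "Main", "Trib"),
  Reach = c("Upper", "Lower", "Upper"),
  Station = c(100.5, 50, 100.5)
)

# A small wide table: one column per cross section.
wide = data.frame(Time = 1:2, 10:11, 20:21, 30:31)

# Option 1: encode river and reach in the column names.
names(wide)[-1] = make_xs_labels(xs.info$River, xs.info$Reach,
  xs.info$Station)

# Option 2: carry the same information as an attribute of the data frame.
attr(wide, "xs.info") = xs.info

names(wide)
```

With option 1 the two cross sections at station 100.5 get distinct labels because the river and reach differ.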

replace all `h5::[.]` calls with `readDataSet(openDataSet(.))`

Much faster.

Same station names in different reaches leads to errors in read_sediment

Seems to be caused by the automatic name matching that occurs in dplyr::bind_rows. Need to decide how to handle this. Probably time to replace the "XS_" prefix names with something that combines river/reach/column.
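The name collision can be illustrated with base R, since `rbind` on data frames also matches columns by name, as `dplyr::bind_rows` does. This is only a sketch: the tables, the reach names, and the helper `qualify_names` are made up, and how `read_sediment` actually combines reaches is not shown here.

```r
# Two reaches that both contain a cross section at station 100.5.
upper = data.frame(Time = 1:2, XS_100.5 = c(1.0, 1.1))
lower = data.frame(Time = 1:2, XS_100.5 = c(9.0, 9.1))

# Stacking by name pools both reaches into a single XS_100.5 column,
# so the two cross sections can no longer be told apart.
pooled = rbind(upper, lower)

# Hypothetical fix: replace the "XS_" prefix with a river/reach prefix.
qualify_names = function(d, river, reach) {
  is.xs = grepl("^XS_", names(d))
  station = sub("^XS_", "", names(d)[is.xs])
  names(d)[is.xs] = paste(river, reach, station, sep = "/")
  d
}

upper.q = qualify_names(upper, "Main", "Upper")
lower.q = qualify_names(lower, "Main", "Lower")

# The only shared name is now the time column.
intersect(names(upper.q), names(lower.q))

# Side-by-side wide table with one distinct column per cross section.
combined = cbind(upper.q, lower.q[-1])
```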

`read_sediment` not returning correct grain classes

Appears to only ever return "Clay".

`reformat_fields` creates extra column when applied to wide table

Creates a "Station" column filled with NA values when the Station column does not exist (e.g. in all wide tables). The function should skip any column that is not present in the data frame.
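One way to guard against this is to apply formatting only to the columns that are present. The function `reformat_existing` and the `formatters` list below are hypothetical stand-ins, not the package's actual `reformat_fields` implementation.

```r
# Hypothetical sketch: apply each formatter only if its column exists.
reformat_existing = function(d, formatters) {
  present = intersect(names(formatters), names(d))
  for (field in present) {
    d[[field]] = formatters[[field]](d[[field]])
  }
  d
}

# Assumed formatters, for illustration only.
formatters = list(Station = as.numeric, Time = as.character)

# A wide table has no Station column, so only Time is touched.
wide = data.frame(Time = 1:2, XS_100.5 = c(1, 2))
wide.out = reformat_existing(wide, formatters)

# A long table has a Station column, which is converted to numeric.
long = data.frame(Time = 1:2, Station = c("100.5", "50"),
  Value = c(1, 2), stringsAsFactors = FALSE)
long.out = reformat_existing(long, formatters)
```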

add RAS version checking for list_*

list_* functions do not currently check whether they support the specified RAS version. This check should be added to the case statement:

stop("RAS version ", ras.version, " is not currently supported", call. = FALSE)
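A sketch of how such a check could sit in the default branch of a `switch()`. The function name `list_supported` and the version strings are assumptions made for illustration; the table path is the one used in the benchmark elsewhere on this page. Note that the argument to `stop()` is spelled `call.`, with the dot at the end.

```r
# Hypothetical list_* style function with a version check.
list_supported = function(ras.version) {
  switch(ras.version,
    # Assumed supported versions; both fall through to the same result.
    "5.0.3" = ,
    "5.0.4" = "Results/Sediment/Output Blocks/Sediment/Sediment Time Series/Cross Sections",
    # Default branch: reached when no version label matches.
    stop("RAS version ", ras.version, " is not currently supported",
      call. = FALSE)
  )
}

# An unsupported version raises an error with a readable message.
msg = tryCatch(list_supported("4.1.0"),
  error = function(e) conditionMessage(e))
```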

hdfqlr transition

Tasks:

Migrate backend from hdf5r to hdfqlr.
Use long table format by default. This will resolve some of the complications with River/Reach/Station naming convention and avoid need for matrix transposing (since HDFql swaps row-column order).
Separate some of the advanced functionality into a separate package. Maybe time to rename/start a fresh repository...
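To show why a long table sidesteps the naming problem, here is a base R sketch that turns a wide table with "River/Reach/Station" column names into long format. The function `to_long`, the column naming scheme, and the sample values are assumptions, not the package's API.

```r
# Hypothetical wide table with one column per cross section.
wide = data.frame(
  Time = c(0, 1),
  "Main/Upper/100.50" = c(1, 2),
  "Main/Lower/50.00" = c(3, 4),
  check.names = FALSE
)

# Reshape to one row per time and cross section.
to_long = function(wide, id = "Time") {
  value.cols = setdiff(names(wide), id)
  parts = strsplit(value.cols, "/", fixed = TRUE)
  n = nrow(wide)
  data.frame(
    Time = rep(wide[[id]], times = length(value.cols)),
    River = rep(vapply(parts, `[`, character(1), 1), each = n),
    Reach = rep(vapply(parts, `[`, character(1), 2), each = n),
    Station = rep(as.numeric(vapply(parts, `[`, character(1), 3)), each = n),
    # Columns are concatenated in order, matching the rep(..., each = n) above.
    Value = unlist(wide[value.cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

long = to_long(wide)
```

In the long layout, river, reach, and station are ordinary columns, so identical station values in different reaches no longer compete for the same column name.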

major slowdown after transitioning to hdf5r package

test case: Compare the master branch (h5-based) to hdf5r_transition branch (hdf5r-based).

require(microbenchmark)
ras.file = system.file("sample-data/SampleQuasiUnsteady.hdf", package = "RAStestR")
microbenchmark(vol.change.cum <- read_sediment(ras.file, "Vol Bed Change Cum"))

microbenchmark results:

Unit: milliseconds
expr        min       lq     mean   median       uq      max neval
h5       100.51 104.1777 110.2355 106.9395 111.2867 197.3422   100
hdf5r  5685.172 5761.895 5860.317 5813.281 5908.393 6482.691   100

Benchmark for single table reads:

get_dataset_hdf5r = function(f, table.path) {
  x = hdf5r::H5File$new(f)
  g = x$open(table.path)
  res = g$read()
  g$close()
  x$close()
  res
}

get_dataset_h5 = function(f, table.path, type = "double") {
  x = h5::h5file(f)
  g = h5::openDataSet(x, table.path, type)
  res = h5::readDataSet(g)
  h5::h5close(g)
  h5::h5close(x)
  res
}

myfile = system.file("sample-data/SampleQuasiUnsteady.hdf", package = "RAStestR")
mytable =  "Results/Sediment/Output Blocks/Sediment/Sediment Time Series/Cross Sections/Vol Bed Change Cum"

microbenchmark(
  get_dataset_hdf5r(myfile, mytable),
  get_dataset_h5(myfile, mytable)
)

microbenchmark results:

Unit: milliseconds
                               expr      min       lq     mean   median       uq       max neval
 get_dataset_hdf5r(myfile, mytable) 3.898140 3.995330 4.196941 4.093295 4.254240  6.102233   100
    get_dataset_h5(myfile, mytable) 1.558758 1.607431 2.190121 1.677407 1.758579 50.932394   100
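For a rough comparison, the median timings reported in the two result tables can be turned into ratios. The numbers are copied from the results above; the variable names are only illustrative.

```r
# Median timings in milliseconds, copied from the benchmark results.
full.read = c(h5 = 106.9395, hdf5r = 5813.281)     # read_sediment
single.read = c(h5 = 1.677407, hdf5r = 4.093295)   # single table read

# Slowdown of hdf5r relative to h5 in each benchmark.
full.ratio = unname(full.read["hdf5r"] / full.read["h5"])
single.ratio = unname(single.read["hdf5r"] / single.read["h5"])

round(c(full = full.ratio, single = single.ratio), 1)
```

The full `read_sediment` call is roughly 54 times slower, while a single table read is only about 2.4 times slower, which suggests the per-read cost alone may not account for the whole slowdown.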

