cont-limno / lagosne
Interface to the LAke multi-scaled GeOSpatial & temporal database :earth_americas:
Home Page: https://cont-limno.github.io/LAGOSNE/
@jsta -- I just tried to get Emily Stanley (@ehstanley) set up with the R package and she doesn't have access. How do I change this?
Tables "lakes4ha.buffer500m" and "lakes4ha.buffer100m" do not import with column names (column names in row 1).
This was present in the legacy code base.
Refactoring the code will need to be done carefully.
The vignettes as currently written require that the full LAGOS dataset is installed and available to lagos_load(). I think this is good because it allows us to describe the contents of the data product. However, automated build testing via CI services (and eventually CRAN) breaks because they don't (and probably should not) have the data available. The only solution I can come up with is to make the vignettes static: pre-build all the vignette figures and tables and display the code chunks without running them (eval = FALSE).
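A static chunk pair might look roughly like this; the table name, figure path, and lagos_load() arguments are assumptions for illustration, not the package's confirmed API:

```r
# Chunk 1: shown to the reader but never run on CI/CRAN (chunk header sets eval = FALSE)
library(LAGOS)
lg <- lagos_load(version = "1.054.1")  # requires the full dataset locally
head(lg$epi_nutr)

# Chunk 2: displayed with echo = FALSE; shows a figure pre-built by the maintainers
knitr::include_graphics("figures/epi_nutr_preview.png")
```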
Need to sanitize inputs
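A minimal sketch of what that could look like for user-supplied table names; the helper and the whitelist are hypothetical:

```r
# Validate a user-supplied table name against a known whitelist before using it
lagos_table <- function(lg, table_name) {
  valid <- c("epi_nutr", "lakes_limno", "locus")  # illustrative table names
  table_name <- match.arg(table_name, choices = valid)
  lg[[table_name]]
}
```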
In tables lakes4ha_buffer100m and lakes4ha_buffer500m, the column is named "lagoslakeid" after importing through the package, but the column names in geo v1.040 are "lakes4ha_buffer100m_lagoslakeid" and "lakes4ha_buffer500m_lagoslakeid".
See #15
Some functions have a title in the documentation and others do not. Consider following the rules at: http://style.tidyverse.org/code-documentation.html
Something like the first 10 lines of every table
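If lagos_load() returns a named list of data frames, a preview helper could be as small as this (function name is hypothetical):

```r
# Return the first n rows of every table in the compiled dataset
lagos_preview <- function(lg, n = 10) {
  lapply(lg, utils::head, n = n)
}
```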
the data compilation functions will need to be updated
The keywords in LAGOS:::keyword_partial_key() are not included in any of the user-facing documentation. For example, ?locus does not yield any results. Also: buffer100m.lulc, buffer500m.lulc, lakes.geo, lagos_source_program.
At the very least, make preprocessing functions operate on column names and not column numbers
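For example (the object and every column name other than lagoslakeid are illustrative):

```r
tbl <- geo_tables[["lakes4ha_buffer100m"]]

# fragile: breaks when columns are reordered or added upstream
# ids <- tbl[, 1]

# robust: indexes by name
ids <- tbl[, "lagoslakeid"]
lulc_cols <- grep("^nlcd", names(tbl), value = TRUE)
lulc <- tbl[, c("lagoslakeid", lulc_cols)]
```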
A lot of this code may have reinvented the wheel. It is likely that we can simplify a lot of this by wrapping the dplyr package.
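A hedged sketch of the kind of wrapper that could replace hand-rolled merge loops, assuming most of the preprocessing is joins and de-duplication on lagoslakeid:

```r
library(dplyr)

# Join a limno table to a geo table and drop duplicate rows
lagos_join <- function(limno, geo, by = "lagoslakeid") {
  limno %>%
    left_join(geo, by = by) %>%
    distinct()
}
```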
LAGOS:::lagos_compile(version = "1.054.1", format = "rds") fails with error message:
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) : cannot open compressed file 'C:\Users\Samantha\AppData\Local\LAGOS\LAGOS/data_1.054.1.rds', probable reason 'No such file or directory'
sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] LAGOS_1.054.1
loaded via a namespace (and not attached):
[1] lazyeval_0.2.0 magrittr_1.5 R6_2.1.2 assertthat_0.1 rsconnect_0.4.3
[6] DBI_0.5-12 tools_3.3.1 dplyr_0.5.0 rappdirs_0.3.1 tibble_1.1
[11] Rcpp_0.12.7
rappdirs::user_data_dir("LAGOS")
"C:\Users\Samantha\AppData\Local\LAGOS\LAGOS"
In addition to the ability to select columns by name, allow the user to select pre-defined groups of columns within each table. For example, an "atmospheric deposition" group that may include multiple variables.
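A minimal sketch of how the groups might be wired up; the group names and member columns are placeholders, not the real LAGOS variable names:

```r
# Named groups of related columns
column_groups <- list(
  "atmospheric deposition" = c("no3_deposition", "totaln_deposition"),
  "land use"               = c("nlcd_pct_forest", "nlcd_pct_agriculture")
)

# Select a group from a table, keeping the lake identifier
lagos_select_group <- function(tbl, group) {
  cols <- intersect(column_groups[[group]], names(tbl))
  tbl[, c("lagoslakeid", cols), drop = FALSE]
}
```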
This is the bulk of the old LAGOS R package, but it could be much cleaner if functions were used to reduce redundancy rather than copy-paste.
Store data using https://github.com/hadley/rappdirs
Possibly implement some of the ideas in https://github.com/richfitz/datastorr/blob/master/vignettes/datastorr.Rmd
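A minimal sketch of the rappdirs side, consistent with the path shown in the bug report above; the file-naming scheme is an assumption:

```r
library(rappdirs)

# Path to the compiled dataset under the per-user data directory
lagos_path <- function(version = "1.054.1") {
  file.path(user_data_dir("LAGOS"), paste0("data_", version, ".rds"))
}

# After compiling: saveRDS(compiled, lagos_path())
# On subsequent loads:
lg <- readRDS(lagos_path())
```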
epi.nutr -> epi_nutr
lakes.limno -> lakes_limno
exactly which ones is an open question
sampling event, lagoslakeid, etc.
LAke multi-scaled GeOSpatial & temporal database -> Tools for Interacting with the Lake Multi-scaled Geospatial and Temporal Database
@jsta, where is the best place for this information? As a user, I could imagine wanting this in two places: 1) some sort of documentation listing each variable within each table, along with some metadata (units, plain English description, etc.). We could have documentation for each table, but where (in the package structure) would this go? 2) in a table format, similar to the info table that is currently imported with the rds file.
Add checks to see if data are already loaded. This is what datastorr/storr does well.
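A lightweight stand-in for that behaviour, caching the loaded object in a package-level environment so repeated calls skip the disk read (function and object names are illustrative):

```r
.lagos_cache <- new.env(parent = emptyenv())

lagos_load_cached <- function(version = "1.054.1") {
  key <- paste0("data_", version)
  if (!exists(key, envir = .lagos_cache, inherits = FALSE)) {
    path <- file.path(rappdirs::user_data_dir("LAGOS"),
                      paste0(key, ".rds"))
    assign(key, readRDS(path), envir = .lagos_cache)
  }
  get(key, envir = .lagos_cache, inherits = FALSE)
}
```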
Should this be done at the table level or the module level?
A nice feature of the package might be to compile ALL previously published LAGOS data (I think all of it is on the LTER portal). So the function could be something like lagos_published(paper = c("Oliver2015", "Lottig 2014"))
We might consider adding additional alias terms for the datasets. For example, we might list "chla", "colora", and "doc" as aliases for the epi.nutr table. This would enable searches like ??LAGOS::chla.
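One way to do this with roxygen2 is to add @aliases to a data documentation topic, so help searches on the variable names resolve to the table; the topic title and name below are illustrative:

```r
#' Epilimnion water chemistry (epi.nutr)
#'
#' @name epi_nutr
#' @aliases chla colora doc
#' @docType data
#' @keywords datasets
NULL
```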
See #22, and the output of devtools::test(). Need to throw helpful error messages when queries try to return data that does not exist.
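A sketch of the kind of message that would help, assuming a query helper that looks tables up by name (the helper is hypothetical):

```r
lagos_query <- function(lg, table_name) {
  if (!table_name %in% names(lg)) {
    stop("Table '", table_name, "' not found. Available tables: ",
         paste(names(lg), collapse = ", "), call. = FALSE)
  }
  lg[[table_name]]
}
```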
This was done to be consistent with the legacy loading scripts but can be a real pain for analysis. Is there any reason not to load as character strings instead?
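If the pain point is factor columns coming from the legacy read.table()/read.csv() calls (an assumption about how the raw files are read), the change is a single argument:

```r
# Keep text columns as character rather than factor when reading the raw files
tbl <- read.csv("epi_nutr.csv", stringsAsFactors = FALSE)
```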
One idea might be to use the units package
Inquiring minds want to know.
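A tiny sketch with the units package mentioned above; the variable and the unit choices are illustrative:

```r
library(units)

# Attach a measurement unit to a numeric vector
tp <- set_units(c(12.5, 30.1), mg/L)
tp * 2              # arithmetic carries the unit along
set_units(tp, g/L)  # unit conversion
```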