Working as a data scientist in the environmental field with Tetra Tech (since 1994).
Working in R since 2006 and Shiny since 2017. Using GitHub for version control since 2016.
Quality control checks on continuous data. Example data is from a HOBO data logger with 30-minute intervals.
Home Page: https://leppott.github.io/ContDataQC/
License: MIT License
In the QC or Stats portion show the number of times a parameter exceeds a threshold value.
Include a table and a plot. The plot would have the threshold identified with a line on the plot.
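A minimal sketch of the exceedance count, table, and plot described above. The data values, the column names, and the threshold of 21 are made up for illustration and are not package defaults.

```r
# Hypothetical 30-minute water temperature series (values are invented).
df <- data.frame(
  DateTime = seq(as.POSIXct("2017-07-01 00:00:00", tz = "UTC"),
                 by = "30 min", length.out = 8),
  Water.Temp.C = c(18.2, 19.5, 21.1, 22.4, 20.8, 19.9, 21.6, 18.7)
)
threshold <- 21  # assumed threshold value, not a package default

# Flag and count the records above the threshold.
exceeds  <- df$Water.Temp.C > threshold
n_exceed <- sum(exceeds, na.rm = TRUE)

# Table of exceedance counts by day.
tbl_exceed <- aggregate(exceeds,
                        by = list(Date = as.Date(df$DateTime)),
                        FUN = sum)
names(tbl_exceed)[2] <- "n_exceed"

# Plot the series and mark the threshold with a horizontal line.
plot(df$DateTime, df$Water.Temp.C, type = "b",
     xlab = "Date/Time", ylab = "Water Temp (C)")
abline(h = threshold, col = "red", lty = 2)
```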
Only the file version. "Bad file" error message.
Create data output options that will format the data for other data analysis packages.
Get with Sys.getenv("USERNAME") and add to all report RMD files.
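A possible helper for this, as a sketch only. `USERNAME` is normally set on Windows; the fallback to `USER` and the name `get_user_name()` are assumptions added here for other platforms.

```r
# Return the current user name for use in a report header.
# get_user_name() is a hypothetical helper, not part of the package.
get_user_name <- function() {
  user <- Sys.getenv("USERNAME", unset = NA)   # usually set on Windows
  if (is.na(user) || !nzchar(user)) {
    user <- Sys.getenv("USER", unset = NA)     # usual equivalent on Linux/macOS
  }
  if (is.na(user) || !nzchar(user)) {
    user <- "unknown"                          # nothing available
  }
  user
}

report_user <- get_user_name()
```

In an RMD file the value could then be printed with inline code in the report header.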
Introduced with v2.0.1.9005 with fixes for offset time data. Only affects these types of files.
Add the ability to calculate RBI as a new function.
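Assuming RBI here means the Richards-Baker Flashiness Index (the sum of absolute day-to-day changes in flow divided by the sum of the flows), a new function might look like the sketch below. The name `calc_rbi()` and the choice to return `NA` when any value is missing are assumptions for illustration.

```r
# Richards-Baker Flashiness Index for a vector of daily mean flows.
# calc_rbi() is a hypothetical name used only for this sketch.
calc_rbi <- function(q) {
  # Missing days would make non-adjacent values look adjacent, so return NA.
  if (length(q) < 2 || anyNA(q)) {
    return(NA_real_)
  }
  sum(abs(diff(q))) / sum(q)
}

q_daily <- c(10, 12, 15, 11, 10)  # invented daily flows
rbi <- calc_rbi(q_daily)
```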
Jen. 2017-03-08. Confirmed on 23rd.
Want ability to create new thresholds and save them.
Example code assumes other steps have been done. Need a standalone example.
For some plots the legend covers up data. Move outside of the plot area. Below?
The Aggregate report uses seconds instead of minutes for the sampling interval.
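One possible cause, offered as a guess rather than a diagnosis: `difftime()` chooses its units automatically unless told otherwise, so the same code can report minutes for one file and seconds for another. The sketch below requests minutes explicitly.

```r
t1 <- as.POSIXct("2017-07-01 00:00:00", tz = "UTC")
t2 <- as.POSIXct("2017-07-01 00:30:00", tz = "UTC")
t3 <- as.POSIXct("2017-07-01 00:00:30", tz = "UTC")

# With the default units = "auto" the unit depends on the size of the gap.
units_long  <- units(difftime(t2, t1))  # 30-minute gap
units_short <- units(difftime(t3, t1))  # 30-second gap

# Asking for minutes explicitly gives a consistent numeric interval.
interval_min_a <- as.numeric(difftime(t2, t1, units = "mins"))
interval_min_b <- as.numeric(difftime(t3, t1, units = "mins"))
```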
Add ability for the user to view/modify the thresholds and user defined values.
First step may be just to export to file so user can create a different config file.
Plot data with flags.
Red dots = "F"ail
Orange dots = "S"uspect
May need to be a separate Report, e.g., Report_QC_plotflags.rmd.
Moving from R 3.4.2 to R 3.4.3 the vignette does not install on a clean install.
In the non-File version of Aggregate the QC report is run inside the loop so it runs on each individual file. It should be run on the final file that was put together.
Fixed when creating fun.Aggregate.File(). Need to make the change in fun.Aggregate().
Not formatted for markdown and is old.
Use "toupper" for error checking.
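A sketch of case-insensitive input checking with `toupper`. The helper name `check_choice()` and the list of valid values are assumptions for illustration.

```r
# Match user input against valid values without regard to case.
# check_choice() is a hypothetical helper used only for this sketch.
check_choice <- function(x, valid) {
  x_up <- toupper(trimws(x))
  idx  <- match(x_up, toupper(valid))
  if (is.na(idx)) {
    stop("Invalid value '", x, "'. Valid values: ",
         paste(valid, collapse = ", "), call. = FALSE)
  }
  valid[idx]  # return the canonical spelling
}

valid_ops <- c("QCRaw", "Aggregate", "SummaryStats")  # assumed list
op <- check_choice("qcraw", valid_ops)
```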
Season not being assigned for Dec 31.
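A sketch of a season assignment that covers every day of the year, including Dec 31. The season start dates (0301, 0601, 0901, 1201) and the function name are assumptions and may not match the values in the package configuration.

```r
# Assign a season from the month-day of each date.
# Winter is the default so that Dec 1 to Dec 31 and Jan 1 to the end of
# February are both covered without a separate upper bound.
assign_season <- function(dates) {
  md <- as.integer(format(dates, "%m%d"))
  season <- rep("Winter", length(md))
  season[md >= 301 & md < 601]  <- "Spring"
  season[md >= 601 & md < 901]  <- "Summer"
  season[md >= 901 & md < 1201] <- "Fall"
  season[is.na(md)] <- NA_character_
  season
}

test_dates <- as.Date(c("2016-12-31", "2016-01-01", "2016-03-01",
                        "2016-08-31", "2016-11-30"))
seasons <- assign_season(test_dates)
```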
Comment from user.
File names are built from user input. May want to use the content of the files instead. For example, if the request is for 2016-01-01 to 2016-12-31 but there are only 3 months of data, the file name could reflect that: 2016-06-01 to 2016-08-31.
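A sketch of building the file name from the dates actually present in the data. The name pattern (prefix, site, data type, start date, end date) and the function name are assumptions for illustration.

```r
# Build an output file name from the date range found in the data.
# build_file_name() and its name pattern are hypothetical.
build_file_name <- function(site_id, data_type, dates, prefix = "QC") {
  rng <- range(as.Date(dates), na.rm = TRUE)
  paste0(prefix, "_", site_id, "_", data_type, "_",
         format(rng[1], "%Y%m%d"), "_",
         format(rng[2], "%Y%m%d"), ".csv")
}

data_dates <- c("2016-06-01", "2016-07-15", "2016-08-31")
file_name  <- build_file_name("Site1", "Water", data_dates)
```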
v2.0.1.0002.
In Config.R the names were copied and not modified. Says pH and DO.
After working with more data decided upon some adjustments to the thresholds.
Jen Stamp mods added. 20170728.
Add more to the summary stats. Right now just daily means.
7-day running mean, min, max.
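A base R sketch of the 7-day running statistics. It assumes one value per day with no missing days, and uses a trailing window so the first six days are `NA`. The helper name `rolling_stat()` and the data are invented.

```r
# Apply a function over a trailing window of fixed width.
# rolling_stat() is a hypothetical helper used only for this sketch.
rolling_stat <- function(x, width, fun) {
  n   <- length(x)
  out <- rep(NA_real_, n)
  if (n >= width) {
    for (i in width:n) {
      out[i] <- fun(x[(i - width + 1):i])
    }
  }
  out
}

daily_mean <- c(10, 12, 11, 13, 15, 14, 16, 18, 17, 19)  # invented values

run_mean <- rolling_stat(daily_mean, 7, mean)
run_min  <- rolling_stat(daily_mean, 7, min)
run_max  <- rolling_stat(daily_mean, 7, max)
```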
Input and Output folders are restricted to a base/root directory.
Should modify so that users can have input and output in different locations.
Add the ability to work on a single file. The current workflow uses the user-supplied variables (file name, file data, start date, and end date) to search a directory for all matching files.
Users aren't seeing and running the code for the "parameters" and "directories" when running examples, so things aren't working. Make each example block self-contained so that no previous blocks need to be run. Makes for longer examples but should be less confusing.
Flag count for each parameter.
Should probably include Fail and Suspect.
This would allow the user to review the file and sort on those with issues.
Or could have an overall flag count of Fail and Suspect for each row. Not sure if it is needed by parameter.
Suggested by Jimmy while working on the TN files.
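Both options could be sketched as below. The flag column names and values are invented and assume flag columns start with "Flag." and use "P", "S", and "F".

```r
# Hypothetical flag columns; names and values are made up.
df <- data.frame(
  Flag.Water.Temp.C = c("P", "F", "S", "P"),
  Flag.Air.Temp.C   = c("P", "F", "P", "S"),
  Flag.Water.P.psi  = c("P", "S", "P", "P"),
  stringsAsFactors = FALSE
)
flag_cols <- grep("^Flag\\.", names(df), value = TRUE)
flags     <- df[, flag_cols, drop = FALSE]

# Overall Fail and Suspect counts for each row.
df$n_Fail    <- rowSums(flags == "F", na.rm = TRUE)
df$n_Suspect <- rowSums(flags == "S", na.rm = TRUE)

# Fail and Suspect counts for each parameter.
flag_by_param <- sapply(flags, function(x) {
  c(Fail = sum(x == "F", na.rm = TRUE),
    Suspect = sum(x == "S", na.rm = TRUE))
})

# Sort so rows with the most issues come first for review.
df_review <- df[order(-(df$n_Fail + df$n_Suspect)), ]
```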
Make use of futile.logger library for a more developed logging of errors and printing to the screen for users.
Site - Parameter - Julian - MonthDay
other columns are Years (down and across).
Fill in dates at the intersections.
See box plotting as an option.
Dissimilar files cause errors when using the "file" versions of the functions.
For example, running Aggregate on files from more than one SiteID or more than one data type (Water or AW) results in a failure.
Need an explicit check and detailed error to the user.
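A sketch of such a check, assuming the "file" versions expect a single SiteID and a single data type. The function name and arguments are hypothetical.

```r
# Stop with a detailed message if the files are not from the same
# SiteID and data type. check_files_similar() is a hypothetical helper.
check_files_similar <- function(site_id, data_type) {
  problems <- character(0)
  sites <- unique(site_id)
  types <- unique(data_type)
  if (length(sites) > 1) {
    problems <- c(problems,
                  paste0("multiple SiteIDs (", paste(sites, collapse = ", "), ")"))
  }
  if (length(types) > 1) {
    problems <- c(problems,
                  paste0("multiple data types (", paste(types, collapse = ", "), ")"))
  }
  if (length(problems) > 0) {
    stop("Files cannot be combined: ", paste(problems, collapse = "; "),
         call. = FALSE)
  }
  invisible(TRUE)
}

ok <- check_files_similar(c("Site1", "Site1"), c("Water", "Water"))
```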
When using the package with Shiny the reports are not working.
Jen. 2017.03.08.
Gaps not always visible in plots in reports. Ensure not interpolating and/or filling gaps.
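One way to make gaps visible, as a sketch: expand the data to a complete time sequence so missing records become `NA`, which breaks the line in a base R plot. This assumes timestamps fall on a regular grid. The function name, column names, and data are invented.

```r
# Expand to a full time sequence so missing records become NA rows.
# fill_gaps_with_na() is a hypothetical helper used only for this sketch.
fill_gaps_with_na <- function(df, interval_min = 30) {
  full_times <- seq(min(df$DateTime), max(df$DateTime),
                    by = interval_min * 60)  # numeric "by" is in seconds
  out <- df[match(full_times, df$DateTime), , drop = FALSE]
  out$DateTime <- full_times
  rownames(out) <- NULL
  out
}

# Invented data with a gap between 00:30 and 02:00.
df <- data.frame(
  DateTime = as.POSIXct(c("2017-07-01 00:00:00", "2017-07-01 00:30:00",
                          "2017-07-01 02:00:00", "2017-07-01 02:30:00"),
                        tz = "UTC"),
  Water.Temp.C = c(18.2, 18.5, 19.4, 19.1)
)
filled <- fill_gaps_with_na(df)

# Lines are not drawn across NA values, so the gap shows in the plot.
plot(filled$DateTime, filled$Water.Temp.C, type = "l",
     xlab = "Date/Time", ylab = "Water Temp (C)")
```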
Data from different sensors in the same file can have different start times (e.g., x:12 and x:17). When the files are combined this causes every other row for each parameter to be NA.
Test user David (email 20170607).
QC of gage data file not picking up all parameters. Only a select few.
Using the "Dunfield" dataset, the QC step fails to process (NA for time interval) and the Reports show the time interval as zero.
Using v2.0.1.9013.
Jen, 2017-03-23.