flusightnetwork / cdc-flusight-ensemble
Guidelines and forecasts for a collaborative U.S. influenza forecasting project.
Home Page: http://flusightnetwork.io/
Could we have the "FluSight Network" and "CDC FluSight Network" text in the top left of the visualization homepage link directly to the README file, so folks who find this page could understand the context?
These folders should have 234 files, but only have 233.
Or fix the issue with using the generated scores.csv in the visualizer early.
Some files from the CUBMA model are missing. When running the validate_predictions file, I got an error because the CUBMA/EW51-2010-CUBMA.csv file doesn't exist. There may be others that don't exist as well; this was just the first one it ran across.
The bin for CU week 53 was not empty, contrary to expectation. This probably means the probabilities assigned to weeks 1 through 20 need to be bumped up by 1 bin. Will check in with Sasi about a fix.
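If that's the case, the suspected off-by-one could be sketched roughly like this. The bin labels and data structure here are assumptions for illustration, not the repo's actual file format:

```python
def shift_week_bins(probs):
    """Hypothetical fix sketch: if the week-53 bin holds probability mass
    in a 52-week season, fold it into week 1 and cascade the week 1..20
    probabilities up by one bin.

    probs: dict mapping MMWR week label (40..53, 1..20) -> probability.
    """
    if probs.get(53, 0.0) == 0.0:
        return probs  # nothing to fix
    fixed = dict(probs)
    prev = fixed.pop(53)
    # week 1 receives the stray week-53 mass, week 2 receives week 1's, etc.
    for w in range(1, 21):
        fixed[w], prev = prev, fixed.get(w, 0.0)
    return fixed
```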
I'm getting very small differences in my spot-checks of ensemble distributions, particularly in the 1-4 week ahead forecasts with TRW, TW, and TTW (see example image below).
Very possibly a problem on my end, which I'm checking, but I also wanted to ask: does any rounding happen in creating the new distribution?
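For reference, the kind of spot-check being described can be sketched like this, with made-up weights and bin probabilities rather than the actual component forecasts:

```python
# Build a weighted average of component bin probabilities at full
# precision, then compare against a version rounded to a fixed number
# of decimals; small discrepancies like those reported would show up
# in the difference. All numbers here are illustrative.
components = [
    [0.10, 0.25, 0.40, 0.25],  # model A bin probabilities
    [0.05, 0.30, 0.45, 0.20],  # model B bin probabilities
]
weights = [0.6, 0.4]

ensemble = [sum(w * m[i] for w, m in zip(weights, components))
            for i in range(len(components[0]))]
rounded = [round(p, 4) for p in ensemble]

diffs = [abs(a - b) for a, b in zip(ensemble, rounded)]
print(max(diffs))
```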
Will create one file for each ensemble weighting scheme. Prospective weights will be listed as season "2017/2018".
Need to create three sets of folders for forecasts
Also, this will require changing file paths in other scripts (visualizations, score calculations, etc.) that depend on these files.
metadata and file submission
Region 6 week 46
The point forecast for US National peak week doesn't match up at all with the underlying distribution. The point forecast is for week 17, but the probabilistic values put it in the week 51 through week 7 range. Could the code that generates the point forecast be mishandling the New Year's transition?
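A rough sketch of the suspected issue, with illustrative names and probabilities rather than the repo's actual code: the modal peak week has to be chosen using season ordering, where weeks 40-52/53 precede weeks 1-20, not numeric week order.

```python
def season_order(week, season_weeks=52):
    """Map an MMWR week to its position in a season starting at week 40."""
    return week - 40 if week >= 40 else week + season_weeks - 40

def peak_week_point_forecast(bin_probs):
    """bin_probs: dict MMWR week -> probability; return the modal week."""
    return max(bin_probs, key=bin_probs.get)

probs = {51: 0.4, 52: 0.3, 1: 0.2, 17: 0.1}
# sorting numerically would put week 1 first; season order keeps 51 first
assert sorted(probs, key=season_order) == [51, 52, 1, 17]
print(peak_week_point_forecast(probs))  # -> 51
```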
Previously, in #14 , we had decided that files from EW40 of year k through EW20 of year k+1 would be submitted. However, @tkcy brought up the point that the challenge does not run for those weeks this year, so we are training on weeks that are not in the competition. A question for @craigjmcgowan : what is the "EW" label for the first and last files that will be submitted for the 2017/2018 season?
I updated the metadata files to have these new fields:
These fields should be used to populate the visualization legend.
Need to compare the 2016/17 ensemble outputs to all submitted models to see how they would have compared. Could also compare to an unweighted average of just the models submitted by teams that are participating in the ensemble project (i.e. Delphi, CU, KoT, and LANL models)
Currently the ReichLab-KCDE folder has 227 files, but should have 234. It doesn't have any EW20 files.
I.e. onset week, peak week, peak incidence.
@brookslogan to have this script be more portable, can we move this code back to loading the package directly?
I've finished checking the scores generated in Travis against scores calculated using the FluSight R package. We're down to 110 discrepancies of greater than 10^-12, all related to peak week in the 2014/15 season. The errors occur in Regions 2, 3, 5, and 7, as well as US National, all of which have week 52 as the peak week.
I looked at one particular error in detail - the target-based model forecasts for Epiweek 53. Specific target is HHS Region 3 peak week. Correct score should be -0.627, summing probabilities of weeks 51, 52, and 53. Travis is assigning -1.11, apparently from summing probabilities of weeks 51, 52, and 1.
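A minimal sketch of that comparison, with illustrative probabilities rather than the actual forecast values:

```python
import math

# The peak-week log score sums the probabilities of the true week and
# its neighboring weeks (in season order), then takes the log. In a
# 53-week season, the neighbors of a week-52 peak are weeks 51, 52, and
# 53; pairing week 52 with week 1 instead reproduces the kind of
# discrepancy reported above. Probabilities below are made up.
def log_score(bin_probs, weeks):
    return math.log(sum(bin_probs.get(w, 0.0) for w in weeks))

bin_probs = {51: 0.20, 52: 0.25, 53: 0.08, 1: 0.005}
correct = log_score(bin_probs, [51, 52, 53])  # log(0.53)  ~ -0.635
wrong   = log_score(bin_probs, [51, 52, 1])   # log(0.455) ~ -0.787
```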
If they don't have these bins, we need to fix this. Relevant to the discussion in #14.
What is "Weighted ILI (%)"?
What does a probability of 0.3 mean? Any individual has a 30% chance of getting the flu? A 0.3% chance?
What does it mean that the mean log score for 3 wk is -7.93?
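For what it's worth, a mean log score can be read back as a probability:

```python
import math

# A mean log score of S corresponds to a geometric-mean probability of
# exp(S) assigned to the eventually observed bin, so -7.93 means the
# model put, on average, roughly 0.036% probability on what happened.
mean_log_score = -7.93
print(math.exp(mean_log_score))  # ~ 0.00036
```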
Perhaps a blurb or some hover-over text with overall information would help. I clicked through a few different GitHub pages and figured out it's some sort of competition hosted by the CDC. Is this one team's effort? A visualization of all of them put together into an ensemble model?
@lepisma says "Some error in (at least one) metadata file in new PR." Not sure what these are. If you specify exactly which, @brookslogan might be able to fix and resubmit.
Have "90%", "50%" and "none" as options.
Travis is saying "Some error in CU-EAKFC_SEIRS 2011-19 for HHS Region 4, Season onset". Is there some obvious error with this entry file?
FWIW - the (long) transcripts from the site builds (including errors) can be seen here:
https://travis-ci.org/FluSightNetwork/cdc-flusight-ensemble
Errors in files don't get detected until they are read, which happens while calculating scores. The code right now skips adding rows for those files to scores.csv. A list of the files that failed would make it easy for the viz to skip them, and would also help in debugging errors, since we are returning a 0 exit code.
We updated the metadata template so the information was more efficiently captured. @brookslogan @tkcy can you check your metadata files to make sure that I didn't screw anything up in the transition?
Log scores of 0 in the summary statistics table are probably wrong, e.g. for CU-BMA and ReichLab-SARIMA1 in HHS Region 3 in 2012/2013.
Maybe choose specific real licenses for the different specifications.
Each ensemble specification will have its own csv file with weights in it. Each file should have the following three columns:
If the file doesn't have one or both of these columns, then we assume them to be the same across all targets or all locations.
We can impose the check that, for a fixed target (t) and location (l) (if specified), the weights across all component models sum to 1.
Currently, the script that turns weights into CV ensemble entries needs the above format for the weights file. See, as an example, this file that has a functioning set of example weights.