itf_power_bi's Introduction

ITF_Power_BI Repository (ARCHIVED)

ITF Internal Dashboard Refresh

Project Archived

This repository contained the first working data streams for the global COVID-19 dashboards maintained by the International Task Force. The data model required writing directly to CSV files that were tracked in this repository, which became increasingly difficult to manage and update. A code-only repository has superseded this one, using GitHub Actions to write data to CDC AZDL storage. Development will continue there.

New Repository

Project Description:

This repository houses R functions and scripts used in the US Centers for Disease Control and Prevention (CDC) COVID-19 Response International Task Force (ITF) COVID-19 Dashboard.

As part of the CDC COVID-19 Response, the ITF Situational Awareness & Visualization (SAVI) Team created and maintains an internal Power BI Dashboard to assist Task Force and response leadership with situational awareness of the global pandemic and response. The dashboard contains analyses of the most recent global case and testing data from multiple sources. The Power BI report that generates the dashboard runs multiple R scripts to refresh, process, and update the data as CSV files, which are then imported into Power BI for visualization. The R functions in this project read in case and testing data, apply algorithms, and populate the underlying data tables of the report. Access to this dashboard is currently limited to CDC staff.

The ITF has also created several curated Power BI views of global data on the public CDC COVID Data Tracker (https://covid.cdc.gov/covid-data-tracker/#global-counts-rates) to communicate to the general public the types of analyses that CDC is conducting using international data. The code saved to this repository would be used to populate the data underlying those views in a Power BI Dashboard.

How to run:

Each function that produces a final analytic data set for visualization is called in the "demo.R" script. These data sets can then be analyzed and visualized directly in R, or imported into Power BI to replicate the visuals produced by the ITF. More information can be found in the description.md file in the Rfunctions folder.

In addition, the "hotspot" trajectory functions, which classify epidemic curve status based on incidence and its rate of change, have been separated so that they can be run on any properly formatted data set. For instructions on how to run this code, see the [how to use hotspot code using your own data.md](./Rfunctions/how%20to%20use%20hotspot%20code%20using%20your%20own%20data.md) document in the Rfunctions folder.
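As a rough illustration of the general idea (classifying a curve from its incidence and rate of change), the base-R sketch below may help. It is not the ITF hotspot algorithm: the function name `classify_trajectory`, the 7-day windows, the cutoffs, and the category labels are all hypothetical placeholders. The actual implementation is in the Rfunctions folder.

```r
# Illustrative only: NOT the ITF hotspot algorithm.
# The function name, windows, cutoffs, and labels are hypothetical.
classify_trajectory <- function(new_cases, population,
                                low_incidence = 10,   # per 100,000 per week (assumed)
                                change_cutoff = 10) { # percent (assumed)
  stopifnot(length(new_cases) >= 14, population > 0)
  n <- length(new_cases)
  this_week <- sum(new_cases[(n - 6):n])
  last_week <- sum(new_cases[(n - 13):(n - 7)])

  # Weekly incidence per 100,000 population
  incidence <- 1e5 * this_week / population

  # Percent change versus the prior week; undefined when the prior week is zero
  pct_change <- if (last_week > 0) 100 * (this_week - last_week) / last_week else NA_real_

  status <- if (incidence < low_incidence) {
    "low incidence"
  } else if (is.na(pct_change)) {
    "undetermined"
  } else if (pct_change > change_cutoff) {
    "growing"
  } else if (pct_change < -change_cutoff) {
    "declining"
  } else {
    "plateau"
  }
  list(incidence = incidence, pct_change = pct_change, status = status)
}

# Toy series: 7 days at 100 cases/day, then 7 days at 150 cases/day
res <- classify_trajectory(c(rep(100, 7), rep(150, 7)), population = 1e6)
```

With the toy series, weekly incidence is 105 per 100,000 and the week-over-week change is 50 percent, so the sketch labels the curve "growing" under these assumed cutoffs.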

Data sources referenced:

The project uses several publicly-available data sources, including:

The COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, cases and deaths data sets:

https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv

https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

More info here: https://github.com/CSSEGISandData/COVID-19

and here: https://coronavirus.jhu.edu/map.html

Citation: Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533-534. doi: 10.1016/S1473-3099(20)30120-1

The World Health Organization COVID-19 Global data set:

https://covid19.who.int/WHO-COVID-19-global-data.csv

More info here: https://covid19.who.int/

Our World In Data Testing, Hospitalization, and Vaccination data sets:

https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv

https://github.com/owid/covid-19-data/blob/master/public/data/hospitalizations/covid-hospitalizations.csv

https://github.com/owid/covid-19-data/tree/master/public/data/vaccinations

More info here: https://ourworldindata.org/coronavirus-testing, https://ourworldindata.org/covid-hospitalizations, https://ourworldindata.org/covid-vaccinations

Even more info here: https://github.com/owid/covid-19-data/blob/master/public/data/README.md

Citation: Max Roser, Hannah Ritchie, Esteban Ortiz-Ospina and Joe Hasell (2020) - "Coronavirus Pandemic (COVID-19)". Published online at OurWorldInData.org. Retrieved from: 'https://ourworldindata.org/coronavirus' [Online Resource]

FIND Testing data set:

https://raw.githubusercontent.com/dsbbfinddx/FIND_Cov_19_Tracker/master/input_data/cv_data_download.csv

More info here: https://www.finddx.org/covid-19/test-tracker/

Standardized population data:

https://www.cia.gov/library/publications/the-world-factbook/fields/335rank.html

Continent classifications:

https://pkgstore.datahub.io/JohnSnowLabs/country-and-continent-codes-list/country-and-continent-codes-list-csv_csv/data/b7876b7f496677669644f3d1069d3121/country-and-continent-codes-list-csv_csv.csv
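The two JHU CSSE time-series files listed above are published in wide format, with one column of cumulative counts per date. A minimal base-R sketch of reshaping that layout into one row per location and date follows. It uses a small inline sample with made-up counts rather than a download, and the helper name `jhu_wide_to_long` is hypothetical and not taken from this repository.

```r
# Hypothetical helper (not from this repository): reshape a JHU CSSE
# time-series table from wide (one column per date) to long format.
jhu_wide_to_long <- function(wide, value_name = "cumulative") {
  id_cols <- c("Province/State", "Country/Region", "Lat", "Long")
  date_cols <- setdiff(names(wide), id_cols)
  n <- nrow(wide)

  # Blank provinces may be read as NA; normalize them to empty strings
  province <- as.character(wide[["Province/State"]])
  province[is.na(province)] <- ""

  long <- data.frame(
    country = rep(wide[["Country/Region"]], times = length(date_cols)),
    province = rep(province, times = length(date_cols)),
    date = as.Date(rep(date_cols, each = n), format = "%m/%d/%y"),
    value = unlist(wide[date_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  names(long)[names(long) == "value"] <- value_name
  long[order(long$country, long$province, long$date), ]
}

# Small inline sample in the same column layout as the published files
# (counts are made up for illustration)
sample_csv <- "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Afghanistan,33.9,67.7,0,0,1
,Albania,41.2,20.2,0,2,5"
wide <- read.csv(text = sample_csv, check.names = FALSE, stringsAsFactors = FALSE)
long <- jhu_wide_to_long(wide, value_name = "cumulative_cases")
```

To run this against the live files, `read.csv()` can be pointed at either URL above with the same `check.names = FALSE` argument so that the date column names are preserved.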

Public Domain

This repository constitutes a work of the United States Government and is not subject to domestic copyright protection under 17 USC § 105. This repository is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication. All contributions to this repository will be released under the CC0 dedication. By submitting a pull request you are agreeing to comply with this waiver of copyright interest.

License

The repository utilizes code licensed under the terms of the Apache Software License and therefore is licensed under ASL v2 or later.

The source code in this repository is free: you can redistribute it and/or modify it under the terms of the Apache Software License version 2, or (at your option) any later version.

The source code in this repository is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Apache Software License for more details.

You should have received a copy of the Apache Software License along with this program. If not, see http://www.apache.org/licenses/LICENSE-2.0.html

The source code forked from other open source projects will inherit its license.

Privacy

This repository contains only non-sensitive, publicly available data and information. All material and community participation is covered by the Surveillance Platform Disclaimer and Code of Conduct. For more information about CDC's privacy policy, please visit http://www.cdc.gov/privacy.html.

Contributing

Anyone is encouraged to contribute to the repository by forking and submitting a pull request. (If you are new to GitHub, you might start with a basic tutorial.) By contributing to this project, you grant a world-wide, royalty-free, perpetual, irrevocable, non-exclusive, transferable license to all users under the terms of the Apache Software License v2 or later.

All comments, messages, pull requests, and other submissions received through CDC including this GitHub page are subject to the Presidential Records Act and may be archived. Learn more at http://www.cdc.gov/other/privacy.html.

Records

This repository is not a source of government records, but is a copy to increase collaboration and collaborative potential. All government records will be published through the CDC web site.

Notices

Please refer to CDC's Template Repository for more information about contributing to this repository, public domain notices and disclaimers, and code of conduct.

itf_power_bi's People

Contributors

alexiacouture, als329, beansrowning, blm-ylt4, cubicle3253, dorquiola, firmeza, imujawar, jamesfuller-cdc, jlkibler, kimkimroll, mackenzievr, michaelarikard, mkoh5, mrajeev08, panasci, plt2-cdc, qwt2, wllmn


itf_power_bi's Issues

Archive

Disconnect final dashboard connections that rely on this repo and archive for good.

Remove GMOB data

We're no longer using this as a data source and the pulls take forever.

CIA Data source is down, need to remove altogether

@dorquiola @als329

get_country.R#L83 expects a table from CIA world factbook which as of 1/27 doesn't exist: https://www.cia.gov/the-world-factbook/field/population/country-comparison.

This breaks the Internal dashboard refresh at 0_output_data.R#L31 where it's called.

Remediation

  • in 1663ff4 I pushed a patch to remove the data pull and instead read in the existing CSV data we've saved
    • (edit: See below, this is likely not sufficient because the function is also called inside other functions)
  • Moving forward, get_country should be removed altogether in favor of onetable from SaviR
    • 2020 pop estimates are out of date, we use 2021 for all routine reports now
    • Web scraping from several data sources each run is unnecessary and fragile

Steps

  • Investigate impact of using onetable on downstream code
  • Implement SaviR-based approach
  • Write a test to ensure stability
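One hedged way to keep a refresh from breaking when a remote table disappears is to wrap the live pull in `tryCatch()` and fall back to a cached CSV. The sketch below is not the patch in 1663ff4 (which removes the pull entirely) and not the SaviR approach. The names `read_with_fallback`, `fetch_fn`, and `cache_path` are hypothetical, and the data are toy values.

```r
# Hypothetical helper: try a live pull, fall back to a cached CSV on failure.
read_with_fallback <- function(fetch_fn, cache_path) {
  tryCatch(
    {
      fresh <- fetch_fn()
      # Treat an empty or non-tabular result as a failure too
      if (!is.data.frame(fresh) || nrow(fresh) == 0) stop("empty result")
      fresh
    },
    error = function(e) {
      warning("Live pull failed (", conditionMessage(e), "); using cached copy")
      read.csv(cache_path, stringsAsFactors = FALSE)
    }
  )
}

# Demonstration with a temporary cache file and a fetcher that always fails
cache_path <- tempfile(fileext = ".csv")
write.csv(data.frame(iso3 = c("AFG", "ALB"), population = c(100, 200)),
          cache_path, row.names = FALSE)
broken_fetch <- function() stop("source table no longer exists")
pop <- suppressWarnings(read_with_fallback(broken_fetch, cache_path))
```

A stability test along the lines of the last step could assert that the returned table keeps the expected columns regardless of which branch supplied it.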

Issue with new_tests_smoothed_per_thousand in FIND testing data

new_tests_smoothed_per_thousand=1000*cap_new_tests,

Please check me on this, but I believe the FIND smoothed testing data per 1000 persons is incorrectly transformed from the original data right now. The values are extremely high.

The data from FIND is per 100,000 persons: https://github.com/dsbbfinddx/FINDCov19TrackerData/blob/master/processed/codebook.csv

So the correct calculation should be:

new_tests_smoothed_per_thousand = cap_new_tests/100

or as you have it as similar to other lines:

new_tests_smoothed_per_thousand = 1000*new_tests_smoothed/(100000*pop_100k)

I can do a pull request if you prefer.
-Behzad
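The unit conversion discussed in this issue can be checked with toy numbers. The sketch assumes, as the issue states, that `cap_new_tests` is expressed per 100,000 persons, and further assumes that `pop_100k` is the population divided by 100,000. The values are made up for illustration.

```r
# Toy values (not real data): 500 smoothed daily tests, population 2,000,000
new_tests_smoothed <- 500
pop_100k <- 2e6 / 1e5                           # population in units of 100,000 (assumed meaning)
cap_new_tests <- new_tests_smoothed / pop_100k  # tests per 100,000 persons

# Transformation flagged in the issue
current <- 1000 * cap_new_tests

# Proposed corrections: both express tests per 1,000 persons
per_thousand_a <- cap_new_tests / 100
per_thousand_b <- 1000 * new_tests_smoothed / (100000 * pop_100k)

# Direct calculation from the raw counts, for comparison
per_thousand_direct <- 1000 * new_tests_smoothed / 2e6
```

Under these assumptions both proposed forms agree with the direct calculation (0.25 tests per 1,000 persons), while the flagged transformation is larger by a factor of 100,000.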
