
vaers's Introduction

This is a copy of the Vaccine Adverse Event Reporting System (VAERS) database available from https://vaers.hhs.gov/. I've included all the data files in this repo (with Git LFS) to make the data easier to download and explore. I will try to keep it up to date, but you can always use the above link to get newer data.

I've included the VAERS Data User Guide, which explains the dataset, the format of the files and fields, and the abbreviations used. It also includes the appropriate disclaimers about the data and its accuracy. The system is voluntary and forms can be submitted by laypersons. This causes a few problems:

  • The data can be inaccurate - some records mistakenly have an ONSET_DATE before the VAX_DATE, and some have both a VAX_DATE and an ONSET_DATE but no NUMDAYS.
  • The data can lack important details - there are 80k records without a VAX_DATE, and the descriptions are often vague.
  • The data is under-reported - it's estimated that only 1-10% of cases are reported to VAERS.
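The first two problems can be detected mechanically. The following is a minimal sketch, not the repo's actual script: it assumes each report is a dict of strings keyed by the VAERS column names and that dates use the MM/DD/YYYY layout. The helper names parse_date and find_problems are hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumption: dates look like 03/10/2019


def parse_date(value):
    """Return a date, or None when the value is empty or unparseable."""
    if not value:
        return None
    try:
        return datetime.strptime(value.strip(), DATE_FORMAT).date()
    except ValueError:
        return None


def find_problems(record):
    """List the data problems described above for one report (hypothetical helper)."""
    problems = []
    vax_date = parse_date(record.get("VAX_DATE"))
    onset_date = parse_date(record.get("ONSET_DATE"))

    if vax_date is None:
        problems.append("missing or unparseable VAX_DATE")

    if vax_date and onset_date:
        if onset_date < vax_date:
            problems.append("ONSET_DATE before VAX_DATE")
        if not record.get("NUMDAYS"):
            problems.append("both dates present but no NUMDAYS")

    return problems


# Made-up record for illustration only
example = {"VAX_DATE": "03/10/2019", "ONSET_DATE": "03/07/2019", "NUMDAYS": ""}
print(find_problems(example))
```

Running a check like this over every record before import gives a rough count of how many reports each problem affects.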

I am importing the data into Elasticsearch (6.x) and visualizing it with Grafana (6.x). I also explore the data with Kibana, but I find Grafana's dashboards more flexible to work with. However, there are visualizations and queries (like Significant Terms) that I cannot do in Grafana (vote for grafana/grafana#3163), so I use Kibana for those.
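For readers unfamiliar with Significant Terms, here is a rough sketch of what such a request body might look like. It only builds the JSON and does not contact a cluster. The SYMPTOMS field name and the assumption that both fields are indexed as keywords are hypothetical; the actual index mapping may differ.

```python
import json


def build_significant_terms_query(filter_field, filter_value, terms_field, size=10):
    """Build a search body: filter to a subset, then ask which terms are unusually common in it."""
    return {
        "size": 0,  # return only the aggregation, no individual hits
        "query": {"term": {filter_field: filter_value}},
        "aggs": {
            "significant": {
                "significant_terms": {"field": terms_field, "size": size}
            }
        },
    }


# Hypothetical field names; adjust to the real index mapping
body = build_significant_terms_query("VAX_COMBOS", "FLU3::MMRV", "SYMPTOMS")
print(json.dumps(body, indent=2))
```

The printed body could be pasted into Kibana's Dev Tools console as the body of a _search request against the index.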

I currently apply a few corrections/modifications to the data:

  • When there is no NUMDAYS field but there is a VAX_DATE and an ONSET_DATE, I calculate NUMDAYS
  • When the ONSET_DATE is before the VAX_DATE, I assume this was a simple mistake and swap the dates
  • I create a VAX_COMBOS field of vaccination combinations (e.g., DTAPIPV::FLU4 or FLU3::MMRV)
  • I create a "REACTIONS" field which contains boolean fields about the outcome (DIED, ER_VISIT, etc.)
  • I create a shortened version of the text fields to help in making dashboards without scripted fields
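The corrections above could be sketched roughly as follows. This is an illustration under stated assumptions, not the repo's import script: dates are assumed to be MM/DD/YYYY strings, outcome flags are assumed to be "Y" when set, vaccine types are assumed to be sorted before joining (which matches the two examples given), and the SYMPTOM_TEXT_SHORT name and 100-character cutoff are hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"            # assumption: dates look like 03/10/2019
OUTCOME_FIELDS = ("DIED", "ER_VISIT")  # only the two outcomes named above
SHORT_TEXT_LENGTH = 100             # hypothetical cutoff


def parse_date(value):
    """Return a date, or None when the value is empty or unparseable."""
    if not value:
        return None
    try:
        return datetime.strptime(value.strip(), DATE_FORMAT).date()
    except ValueError:
        return None


def correct_record(record, vax_types):
    """Return a corrected copy of one report (hypothetical helper)."""
    fixed = dict(record)
    vax_date = parse_date(fixed.get("VAX_DATE"))
    onset_date = parse_date(fixed.get("ONSET_DATE"))

    if vax_date and onset_date:
        # Onset before vaccination is treated as a data-entry mistake: swap
        if onset_date < vax_date:
            fixed["VAX_DATE"], fixed["ONSET_DATE"] = fixed["ONSET_DATE"], fixed["VAX_DATE"]
            vax_date, onset_date = onset_date, vax_date
        # Fill NUMDAYS only when it is missing
        if not fixed.get("NUMDAYS"):
            fixed["NUMDAYS"] = (onset_date - vax_date).days

    # Assumption: types are de-duplicated and sorted, then joined with "::"
    fixed["VAX_COMBOS"] = "::".join(sorted(set(vax_types)))

    # Assumption: an outcome column holds "Y" when the outcome occurred
    fixed["REACTIONS"] = {name: fixed.get(name) == "Y" for name in OUTCOME_FIELDS}

    text = fixed.get("SYMPTOM_TEXT") or ""
    fixed["SYMPTOM_TEXT_SHORT"] = text[:SHORT_TEXT_LENGTH]
    return fixed


# Made-up record for illustration only
raw = {"VAX_DATE": "03/10/2019", "ONSET_DATE": "03/07/2019", "NUMDAYS": "",
       "DIED": "", "ER_VISIT": "Y", "SYMPTOM_TEXT": "Fever and rash."}
print(correct_record(raw, ["MMRV", "FLU3"]))
```

Returning a corrected copy rather than mutating the input keeps the raw record available for comparison, which is useful when auditing how many reports each correction touches.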

GOALS: To facilitate a data-based discussion about the benefits and risks of vaccinations. In my professional life, I help organizations process and analyze datasets to make Data-Driven/Led decisions. I wanted to approach this discussion with some of the same tools.

TODO:

  • more data validation
    • confirm VAX_DATE, ONSET_DATE and NUMDAYS agree
    • NLP to extract patient age, dates, and other relevant information from text fields
    • NLP to identify reports that might be less reliable
  • import the data into Postgres and/or SQLite for visualization with Superset and other tools
  • visualize/analyze the data using R/Python
  • (possibly) convert scripts to NodeJS/Python
  • import more datasets
    • Demographics and vaccination coverage to approximate adverse reaction rates
    • Non-US datasets (WHO, others)
    • Vaccine costs and health care costs of the diseases
  • make a Docker container and/or Ansible script to help others get the data up and running
  • more...

SCREENSHOTS:

VAERS ES-Grafana

For more suggestions or more information, you can contact me, yehosef, at gmail.

Yehosef Shapiro
