This project forked from muellermax/covid-19-usa-social-vulnerability

Investigate correlations between Covid-19 confirmed cases/deaths and selected social vulnerability indicators in the USA

Home Page: https://medium.com/@muellermax1985/how-does-covid-19-affect-social-vulnerable-populations-in-the-us-11b1d9109876

Social Vulnerability Index (SVI) and Covid-19 in the USA

A data-based approach to how Covid-19 affects specific groups in the USA.

Installation

The Jupyter notebook is based on Python 3.7 and relies mainly on pandas, numpy and datetime for data preparation. For plotting and visualization I mainly used Seaborn, and plotly for the choropleth maps. To export the pandas style tables, I installed imgkit. As I have also built a simple Multiple Linear Regression model, the notebook needs scikit-learn as well.

Project motivation

There are a few points to mention here:

  • As part of my Data Scientist program at Udacity, I had to write a blog post on Medium about a topic of my choice. At the same time, a colleague showed me the "Uncover Covid-19 Challenge" on Kaggle. That is how I found all those datasets about Covid-19 on Kaggle.
  • Having already read a lot about the implications of Covid-19, especially for poorer or, generally speaking, socially more vulnerable societies, I was at once intrigued by the CDC Social Vulnerability Index.

Therefore I decided to focus my investigation on the impact of Covid-19 in the USA and especially in the US counties. My main questions were:

  1. Which US counties are most affected by Covid-19 in terms of infections and deaths?
  2. Is there a correlation between specific social vulnerability indicators and Covid-19 cases as well as deaths?
  3. Is it possible to build a simple Linear Regression Model that predicts Covid-19 cases and deaths based on specific social vulnerability indicators?
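
As a rough illustration of question 2, a correlation check could look like the sketch below. It is not taken from the notebook: the column names (`EP_POV`, `EP_NOVEH`, `cases_per_100k`) are assumptions, the numbers are made up, and the notebook may work with different columns or with absolute counts.

```python
import pandas as pd

# Made-up values for five hypothetical counties; not real data.
df = pd.DataFrame({
    "EP_POV": [10.0, 15.0, 20.0, 25.0, 30.0],    # assumed: % below poverty
    "EP_NOVEH": [2.0, 9.0, 4.0, 12.0, 8.0],      # assumed: % households without a vehicle
    "cases_per_100k": [50.0, 70.0, 95.0, 110.0, 140.0],  # hypothetical target column
})

indicators = ["EP_POV", "EP_NOVEH"]

# Pearson correlation of each indicator with the case rate, strongest first.
corr = df[indicators].corrwith(df["cases_per_100k"]).sort_values(ascending=False)
print(corr)
```

The same call with a deaths column as the target would cover the second half of the question. A correlation on its own says nothing about causation.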

The Social Vulnerability Index (SVI)

CDC writes in the documentation for the SVI:

The degree to which a community exhibits certain social conditions, including high poverty, low percentage of vehicle access, or crowded households, may affect that community’s ability to prevent human suffering and financial loss in the event of disaster. These factors describe a community’s social vulnerability.

ATSDR’s Geospatial Research, Analysis & Services Program (GRASP) created Centers for Disease Control and Prevention Social Vulnerability Index (CDC SVI or simply SVI, hereafter) to help public health officials and emergency response planners identify and map the communities that will most likely need support before, during, and after a hazardous event.

Examples of social vulnerability indicators are Poverty, Age Over 65, Minority, Speaks English "less than well", No High School Diploma, Single-Parent Households, Mobile Homes, No Vehicle, etc. Furthermore, there is an overall ranking indicator that takes all indicators into account.

The documentation can be found here: https://svi.cdc.gov/data-and-tools-download.html

Findings

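I have written a blog post about the findings, which can be accessed on Medium: https://medium.com/@muellermax1985/how-does-covid-19-affect-social-vulnerable-populations-in-the-us-11b1d9109876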

File description

  • Covid-19 SVI: Jupyter notebook that contains all code and results.
  • confirmed-covid-19-cases-in-us-by-state-and-county: A csv file with all confirmed Covid-19 cases in US states and counties as of April 8th. This file is part of Kaggle's "Uncover Covid-19 Challenge".
  • confirmed-covid-19-deaths-in-us-by-state-and-county: A csv file with all confirmed deaths caused by Covid-19 in US states and counties as of April 8th. This file is part of Kaggle's "Uncover Covid-19 Challenge".
  • SVI2018_US: A csv file with the newest social vulnerability data for US states and counties as of 2018. I downloaded this file from the CDC's website: https://svi.cdc.gov/
  • SVI2018Documentation: A pdf that explains the Social Vulnerability Index in general and the different sub-indicators in the csv. The pdf can be found here: https://svi.cdc.gov/data-and-tools-download.html
  • Media for Medium post: I wrote a small blog post on Medium and had to generate a few pictures from the plots as well. These can be found in this folder.
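
To combine the case data with the SVI data, the files have to be joined on a county identifier. The sketch below shows one way this might be done with pandas. The column names (`county_fips`, `county_name`, `confirmed`, `FIPS`, `EP_POV`) are assumptions and may not match the actual files, and the rows are made-up stand-ins read from in-memory text instead of the real csv files.

```python
import io
import pandas as pd

# In-memory stand-ins for the two csv files; all values are invented.
cases_csv = io.StringIO(
    "county_fips,county_name,confirmed\n"
    "01001,County A,12\n"
    "01003,County B,29\n"
    "36061,County C,9000\n"
)
svi_csv = io.StringIO(
    "FIPS,EP_POV\n"
    "01001,15.4\n"
    "01003,10.6\n"
    "36061,16.2\n"
)

# Read the county codes as strings so that leading zeros are preserved.
cases = pd.read_csv(cases_csv, dtype={"county_fips": str})
svi = pd.read_csv(svi_csv, dtype={"FIPS": str})

# Keep only counties that appear in both tables.
merged = cases.merge(svi, left_on="county_fips", right_on="FIPS", how="inner")
print(merged[["county_name", "confirmed", "EP_POV"]])
```

Reading the codes as strings matters because a numeric read would turn "01001" into 1001 and the join keys might then no longer match.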

How to interact

Every contribution is welcome.

There is always the possibility to look deeper into the provided data. In my research, I have included only 14 indicators, all of them percentage values, and already found some interesting correlations. I have also plotted some maps of the US counties (using plotly); however, I didn't include them in the notebook as they were not necessary for my results.

Further investigation could, in my opinion, focus on more social vulnerability aspects as well as on the absolute values. Also, a very important indicator is missing here: gender. As far as I know, Covid-19 affects all genders; however, in socially vulnerable contexts gender could make a difference.

Finally, my Multiple Linear Regression models did not show very high scores. Maybe there is a way to improve the prediction, e.g. by using other or more features or another model.
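
One possible starting point for such a comparison is sketched below: a linear regression and a random forest are scored with 5-fold cross-validation on the same features. This is only an illustration under stated assumptions. The data is synthetic (random numbers with a deliberately non-linear target), the random forest is just one example of "another model", and nothing here reproduces the notebook's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for three percentage indicators and a target; not real data.
rng = np.random.default_rng(0)
X = rng.uniform(0, 40, size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 5, size=200)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean R^2 over 5 folds for each model, evaluated on the same data.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

Cross-validated scores are less dependent on one lucky or unlucky train/test split, which makes it easier to tell whether a change of features or model helps.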

Acknowledgements

Thanks to Kaggle and the Roche Data Science Coalition for providing the datasets and thus supporting the fight against Covid-19. The challenge can be found here: https://www.kaggle.com/roche-data-science-coalition/uncover

Thanks to the Agency for Toxic Substances and Disease Registry (ATSDR) and the Centers for Disease Control and Prevention (CDC) for providing data and documentation about social vulnerability in the USA. All information can be found here: https://svi.cdc.gov/index.html

Thanks to Udacity, Codecademy and Stack Overflow for always providing answers to my questions.

Author

Maximilian Müller, Business Development Manager in the Renewable Energy sector. Now diving into the field of data analysis.

GitHub repository

Link to GitHub repository: https://github.com/muellermax/Covid-19-social-vulnerability
