Statistics for Data Science Project - How "the Greatest Country in the World" Suffered So Much with the COVID-19 Pandemic
Academic Poster for this project (downloadable Word file): https://liveeduisegiunl-my.sharepoint.com/:w:/g/personal/r20170796_novaims_unl_pt/ETrF87edTqVDh8sXaG1Pb2ABfcosZRWGHdDHVZDLnXS7Gg?e=X3YzVm
Grade: 16.1 out of 20
Introduction:
The main goal of this project is to understand why the United States of America (USA), frequently called "the Greatest Country in the World", drew so much attention for the worst reasons: it was almost always among the countries most affected by the COVID-19 pandemic, in terms of both cases and deaths per 1 million citizens.
To understand how that happened, we developed a state-by-state analysis with demographic, economic, political and coronavirus-related variables.
Deepro et al. (2021) suggested that socio-economic variables are relevant to understanding the number of COVID-19 cases, as expected. Still, we also decided to add a political variable: the percentage of votes for the Republican Party (Donald Trump's party) in the 2020 Presidential Election. We did that because the coronavirus restrictions became very controversial, which led Republicans (generally more conservative people) to reject the use of masks or disregard the lockdowns, while some even believed the pandemic was a hoax.
In total, we began this analysis with 15 features, using the number of COVID-19 deaths per 1 million people as the dependent variable. We then applied several techniques, such as linear regression, decision trees and clustering, as well as statistical hypothesis tests, including the Shapiro-Wilk test, one-way ANOVA and Welch’s t-test.
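As a rough illustration of one of the tests named above, here is a minimal sketch of Welch's t-test, which compares the means of two groups without assuming equal variances. It is not the project's actual code: the helper name welch_t, the two groups and every number in them are made up for the example, and it uses only the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Made-up deaths per 1 million people for two hypothetical groups of states.
group_a = [1800.0, 2100.0, 1950.0, 2300.0, 2050.0]
group_b = [1500.0, 1700.0, 1400.0, 1650.0]

t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

A p-value would then come from the t distribution with df degrees of freedom. If SciPy is available, scipy.stats.ttest_ind(group_a, group_b, equal_var=False) runs the same test and reports the p-value directly.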
Author: Rui Monteiro, R20170796
MSc: Data Science and Advanced Analytics - Nova IMS
Course: Statistics for Data Science
2020/2021