The goal of this project is to provide a near-term outlook on how the pandemic will develop in NYC, in order to reduce uncertainty. We achieve this by developing time series forecasts and building models from multiple types of time series. The final product is an automated system that updates the data, models, and predictions on a daily basis.
This folder contains the ARIMA analysis and the other baseline models (including a regression model) that we used while researching for this project.
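As a rough illustration of what a regression baseline for a time series can look like, here is a minimal sketch that fits a straight-line trend by ordinary least squares and extrapolates it. It is not the code in this folder: the function names `fit_linear_trend` and `forecast_linear_trend` are hypothetical, and the input values are made up for the example.

```python
def fit_linear_trend(series):
    """Fit y = intercept + slope * t by ordinary least squares, with t = 0..n-1."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    # Sums of squares and cross-products around the means.
    s_tt = sum((t - mean_t) ** 2 for t in range(n))
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return slope, intercept


def forecast_linear_trend(series, steps):
    """Extrapolate the fitted trend `steps` periods past the end of the series."""
    slope, intercept = fit_linear_trend(series)
    n = len(series)
    return [intercept + slope * (n + i) for i in range(steps)]


# Made-up daily counts, used only to show the call.
example_forecast = forecast_linear_trend([1, 2, 3, 4], steps=2)
```

A baseline like this is mainly useful as a reference point: a more complex model should at least beat it on held-out days.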
This folder contains the data analysis and transformations we performed on the CDC data. Most of this code was reused when developing the scripts for the DAG.
This folder contains the transformed data from the CDC's published COVID-19 case data. It holds CSV files with each state's COVID-19 case history and vaccination rate history.
This folder contains the prototype for our D3 visualization.
The DAG folder contains the DAG management system and the scripts that it runs. Once the scheduler starts, the system collects data and stores the transformed data in the script folder. It then uses the transformed data to train the VAR model and the RNN model.
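The ordering described above (collect, transform, then train both models) can be sketched as a small dependency graph. This is only an illustration using Python's standard-library `graphlib` (Python 3.9+), not the DAG system in this folder; the task names and the placeholder task bodies are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks it depends on.
PIPELINE = {
    "collect_data": set(),
    "transform_data": {"collect_data"},
    "train_var": {"transform_data"},
    "train_rnn": {"transform_data"},
}


def run_pipeline(tasks, dependencies):
    """Run every task once, only after all of its dependencies have run."""
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        executed.append(name)
    return executed


# Placeholder task bodies that only record that they ran.
run_log = []
TASKS = {name: (lambda n=name: run_log.append(n)) for name in PIPELINE}

order = run_pipeline(TASKS, PIPELINE)
```

Both training tasks depend only on the transformed data, so a scheduler is free to run them in either order or in parallel.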
The Data folder contains the data we collected from NYCHealth. The scripts in the DAG folder use the same data source.
This folder contains the Jupyter notebooks with the analysis for the first and second reports.
This folder stores the processed NYCHealth data.