Using big data to predict daily Australian rainfall!
Our goal is to develop and deploy cloud-based ensemble machine learning models to predict daily Australian rainfall. The data consists of modelled and observed daily rainfall over NSW, Australia, between 1889 and 2014, originally accessed from the figshare platform. The modelled data has been kindly provided by CMIP6, an international collaboration that collects climate model outputs from different groups around the world. We will gather and process the outputs of the separate climate models, consolidate them, and use them in a big data machine learning application that predicts observed rainfall measurements. The final model will be deployed for others to use in their own analyses!
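To make the ensemble idea concrete, here is a minimal sketch of the simplest possible baseline: averaging the daily predictions of several climate models and comparing each against observed rainfall with RMSE. This is not the project's actual model. The model names (`model_a`, `model_b`, `model_c`) and all values are hypothetical and invented purely for illustration.

```python
import math
from statistics import mean

# Toy daily rainfall values in mm/day. These numbers are invented for
# illustration and are NOT taken from the CMIP6 or SILO data.
model_outputs = {
    "model_a": [0.0, 2.0, 5.0, 1.0],
    "model_b": [1.0, 4.0, 3.0, 1.0],
    "model_c": [2.0, 3.0, 4.0, 1.0],
}
observed = [1.0, 3.5, 4.0, 1.5]


def ensemble_mean(outputs):
    """Average the per-day predictions across all models."""
    return [mean(day) for day in zip(*outputs.values())]


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean((p - a) ** 2 for p, a in zip(predicted, actual)))


ensemble = ensemble_mean(model_outputs)

# Score every individual model and the averaged ensemble against observations.
scores = {name: rmse(values, observed) for name, values in model_outputs.items()}
scores["ensemble_mean"] = rmse(ensemble, observed)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.3f}")
```

On these toy numbers the averaged ensemble scores better than any single model. A learned ensemble replaces the plain average with a model trained to map the individual outputs to the observed value.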
- Clone the GitHub repository
- From project root directory, navigate to notebooks folder and open rainfall_analysis.ipynb
- Click on the Run menu and then click on Run All Cells
* R
* Python
* pandas
* rpy2
* dask
* pyarrow
* dplyr
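Before opening the notebook, it can help to confirm that the Python packages listed above are importable. The sketch below uses only the standard library. The helper name `check_python_packages` is hypothetical, and the check does not cover R or dplyr, which have to be verified from within R.

```python
from importlib.util import find_spec

# Python-side dependencies from the list above (R and dplyr are not covered).
PYTHON_PACKAGES = ["pandas", "rpy2", "dask", "pyarrow"]


def check_python_packages(names):
    """Return a dict mapping each top-level package name to True if importable."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_python_packages(PYTHON_PACKAGES).items():
        status = "found" if available else "MISSING"
        print(f"{name}: {status}")
```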
Please note that this notebook is resource intensive and may not run on some machines.
Group 6 Members:
- Kangbo Lu - @KangboLu
- Craig McLaughlin - @cmmclaug
- Debananda Sarkar - @debanandasarkar
- Kevin Shahnazari - @kshahnazari1998
MDS DSCI 525 Instructor Gittu George - @ggeorg02
Data compiled by MDS Instructor Tom Beuzen - @tbeuzen
Modelled data provided by CMIP6: https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6
Data is supported by the Pangeo project: https://pangeo-data.github.io/pangeo-cmip6-cloud/
Observed data is supplied by the Australian SILO database: https://www.longpaddock.qld.gov.au/silo/