
525_group9's Introduction

DSCI 525 Group 9

Introduction

This project works with a dataset of daily rainfall over NSW, Australia, from 1889 to 2014. The dataset also contains daily rainfall predictions for the same region from various climate models in the World Climate Research Programme Coupled Model Intercomparison Project 6 (WCRP CMIP6). The dataset is about 6 GB in size, so we explore several memory-reducing techniques to handle it efficiently. A simple EDA showed that about 5% of the data contains missing values, that the longitude/latitude grids covered by the climate models differ slightly, and that the models produce a diverse range of predicted rainfall distributions for each month. These factors must be taken into account when modelling, which is the ultimate goal of this project. In the future, we will build a cloud-deployed ensemble model that takes the outputs of the climate models as features to predict daily rainfall in Australia.
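One common memory-reducing technique is storing floating-point columns as 32-bit instead of 64-bit values. The sketch below illustrates the idea with pandas on a small synthetic table; it is not necessarily the approach used in this project, and the column names (`lat_min`, `lon_min`, `rain_mm_day`) and the helper `downcast_floats` are hypothetical.

```python
import numpy as np
import pandas as pd


def downcast_floats(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every float64 column stored as float32."""
    float_cols = df.select_dtypes(include="float64").columns
    return df.astype({col: "float32" for col in float_cols})


# Synthetic stand-in for the rainfall table (hypothetical column names).
rng = np.random.default_rng(0)
sample = pd.DataFrame({
    "lat_min": rng.uniform(-36.0, -30.0, size=1000),
    "lon_min": rng.uniform(140.0, 154.0, size=1000),
    "rain_mm_day": rng.gamma(shape=0.5, scale=4.0, size=1000),
})

small = downcast_floats(sample)

# Compare the bytes used by the column data before and after downcasting.
before = sample.memory_usage(index=False, deep=True).sum()
after = small.memory_usage(index=False, deep=True).sum()
print(f"float64: {before} bytes, float32: {after} bytes")
```

Halving the width of each value trades some precision for memory, so whether float32 is acceptable depends on how the rainfall values are used downstream.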

Contributors

  • Sukhleen Kaur (@sukhleen999)
  • Daniel King (@danfke)
  • Pavel Levchenko (@plevchen)
  • Rong Li (@lirnish)

References

The data analysed in this project is taken from https://figshare.com/articles/dataset/Daily_rainfall_over_NSW_Australia/14096681


525_group9's Issues

Milestone 2 To-do List

Pavel:

  • Set up your EC2 instance
  • Set up your JupyterHub

Rong:

  • Set up the server

Sukhleen:

  • Get the data that we wrangled in our first milestone
  • Set up your S3 bucket and move the data

Daniel:

  • Wrangle the data in preparation for machine learning

  • Optional Benchmarking:

    • How long did it take to read the Parquet file from local disk?
    • The Parquet file from S3?
    • The CSV file from S3?
  • Proof-reading and submission: Sukhleen
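
The optional benchmarking questions above could be answered with a small timing helper like the sketch below. It is a minimal illustration using only the standard library; the helper names `time_reads` and `read_csv_rows` are hypothetical, and a synthetic local CSV stands in for the real files. For the actual comparison, each reader would instead be a callable that loads the Parquet or CSV file from local disk or from the S3 bucket.

```python
import csv
import os
import tempfile
import time
from typing import Callable, Dict


def time_reads(readers: Dict[str, Callable[[], object]],
               repeats: int = 3) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each named reader."""
    results = {}
    for name, read in readers.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            read()
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def read_csv_rows(path: str) -> int:
    """Read a CSV file fully and return the number of data rows."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return sum(1 for _ in reader)


# Write a small synthetic CSV so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sample.csv")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time", "rain_mm_day"])
        writer.writerows((i, i * 0.1) for i in range(10_000))
    timings = time_reads({"csv_local": lambda: read_csv_rows(csv_path)})

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Taking the best of several repeats reduces noise from caching and background activity, although for S3 reads the network variation itself may be worth reporting.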

Milestone 4

  • Develop the API
  • Deploy the API
  • Summarize the journey from milestone 1 to 4

Proof-reading and submission: @danfke
