predicting-drought's Introduction

🌧️ Predicting Drought ☀️

Why · Installation · Benchmarks · License

This repository is a code submission for the Master's course 'Machine Learning' at the University of Hamburg.

It builds upon a Kaggle challenge published in 2021. The goal is to predict drought scores for the next 6 weeks using meteorological, soil and drought data from the last 180 days.

More specifically, it explores the use of MiniROCKET-transformed features with different classification and regression techniques, and it analyses the importance of individual features and of the length of the history.
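The prediction setup described above (180 days of history in, 6 weekly scores out) can be sketched as a simple windowing step. This is only an illustration, not the notebook's actual preprocessing: the function name `make_windows`, the input layout and the alignment of weekly scores to days are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

HISTORY_DAYS = 180   # history length stated in this README
HORIZON_WEEKS = 6    # number of weekly scores to predict, stated in this README


def make_windows(
    daily_features: Sequence[Sequence[float]],
    weekly_scores: Sequence[float],
    history_days: int = HISTORY_DAYS,
    horizon_weeks: int = HORIZON_WEEKS,
) -> List[Tuple[List[Sequence[float]], List[float]]]:
    """Build (history, targets) pairs for one location.

    Assumption (hypothetical alignment): week w covers days 7*w .. 7*w + 6
    and weekly_scores[w] is the drought score of that week.
    """
    samples = []
    for first_week in range(len(weekly_scores) - horizon_weeks + 1):
        end = 7 * first_week          # history ends right before the first target week
        start = end - history_days
        if start < 0 or end > len(daily_features):
            continue                  # not enough history for this target week
        history = list(daily_features[start:end])
        targets = list(weekly_scores[first_week:first_week + horizon_weeks])
        samples.append((history, targets))
    return samples


# Toy data: 40 weeks with one feature per day and one score per week.
toy_features = [[float(day)] for day in range(280)]
toy_scores = [float(week) for week in range(40)]
toy_samples = make_windows(toy_features, toy_scores)
print(len(toy_samples))  # 9
```

Each sample pairs 180 daily feature vectors with the 6 weekly scores that follow them, which is the shape both the classifiers and the regressors need.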

🤔 Why Predict Droughts?

Droughts are becoming more severe and more frequent.

Even though the prediction of droughts does not prevent them or make them less harmful, it allows countries, cities or individuals to adjust and take preventive measures when needed.

Predicting droughts may also help us better understand the factors that influence them.

⚙️ Installation

32 GB of RAM is recommended if you plan to re-run this notebook yourself. If you don't have enough RAM, creating a swap file might be a possible solution.

First, you have to download the data from https://www.kaggle.com/cdminix/us-drought-meteorological-data.

Simply place the four files test_timeseries.csv, train_timeseries.csv, validation_timeseries.csv and soil_data.csv directly into the main directory.
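Before starting the notebook, it can help to check that all four files are in place. The following is a minimal sketch that is not part of the repository; the helper name `missing_data_files` is made up for this example.

```python
from pathlib import Path

# The four file names listed in this README.
REQUIRED_FILES = [
    "test_timeseries.csv",
    "train_timeseries.csv",
    "validation_timeseries.csv",
    "soil_data.csv",
]


def missing_data_files(directory="."):
    """Return the required data files that are not present in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All four data files found.")
```

Run it from the main directory of the repository; an empty result means the data is where the notebook expects it.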

Next, start the Jupyter notebook by navigating into the directory in your terminal and typing

jupyter notebook

That's it! 🎉 You are ready to go. All relevant requirements will be installed with pip in the notebook itself.

Executing all cells of the notebook again will take several hours.

JavaScript widgets are used to display colored progress bars. If they are not enabled in Jupyter and you encounter the error "widget javascript not detected. it may not be installed or enabled properly.", try running the following commands before starting the notebook:

sudo jupyter nbextension enable --py --sys-prefix widgetsnbextension
jupyter nbextension enable --py widgetsnbextension

📊 Benchmarks

The following performance benchmarks were either provided with the challenge or obtained in our own experiments:

Model                                     Macro F1 Mean   MAE Mean
Random guessing                           0.108           2.244
Majority class                            0.133           0.578
Random guessing (stratified)              0.164           1.046
Ridge regression (MiniROCKET features)    0.444           0.372
Ridge regression (default features)       0.579           0.255
LSTM (@MiniXC)                            0.639           0.277

Status: August 2021

An up-to-date benchmark list of all submitted models can be found in the original Kaggle task: https://www.kaggle.com/cdminix/us-drought-meteorological-data/tasks?taskId=3422.
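For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of macro F1 and MAE. It is not the notebook's evaluation code. In particular, reading "Mean" as the average over the six forecast weeks is an assumption, and the helper `mean_over_weeks` is hypothetical.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores over `labels`."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that never occurs and is never predicted counts as 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_over_weeks(metric, Y_true, Y_pred):
    """Apply `metric` to each forecast week (column) and average the results.

    Assumption: each row of Y_true / Y_pred holds one sample's weekly scores.
    """
    n_weeks = len(Y_true[0])
    per_week = [
        metric([row[w] for row in Y_true], [row[w] for row in Y_pred])
        for w in range(n_weeks)
    ]
    return sum(per_week) / n_weeks


print(round(macro_f1([0, 0, 1, 2], [0, 1, 1, 2], labels=[0, 1, 2]), 3))  # 0.778
print(mae([0, 0, 1, 2], [0, 1, 1, 2]))                                   # 0.25
```

Macro F1 weights every drought class equally, so rare severe classes count as much as the common "no drought" class. That is why the majority-class baseline scores poorly on F1 while still reaching a low MAE.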

The confusion matrices of the best-performing regressors per feature type developed in this notebook are shown below.

Confusion matrix of ridge regression using MiniROCKET-transformed features:

[Figure: confusion matrix, ridge regression + MiniROCKET]

Confusion matrix of ridge regression using default features:

[Figure: confusion matrix, ridge regression + default features]
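A regressor outputs continuous scores, so building a confusion matrix requires mapping them to discrete classes first. The sketch below shows one way to do that. It assumes six drought classes numbered 0 to 5 and rounding to the nearest class; both are assumptions about the data and not taken from the notebook, and the function names are made up for this example.

```python
import math

N_CLASSES = 6  # assumption: drought classes 0 (none) to 5 (most severe)


def to_class(score, n_classes=N_CLASSES):
    """Round a continuous score to the nearest class and clip to the valid range."""
    rounded = math.floor(score + 0.5)  # round half up, unlike Python's round()
    return min(n_classes - 1, max(0, rounded))


def confusion_matrix(true_classes, predicted_classes, n_classes=N_CLASSES):
    """Count samples per (true class, predicted class); rows are true classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_classes, predicted_classes):
        matrix[t][p] += 1
    return matrix


true_classes = [0, 1, 2, 5]
raw_predictions = [0.4, 1.5, 1.2, 7.3]  # toy regressor outputs
predicted_classes = [to_class(score) for score in raw_predictions]
print(predicted_classes)  # [0, 2, 1, 5]

matrix = confusion_matrix(true_classes, predicted_classes)
for row in matrix:
    print(row)
```

Clipping matters because a linear model such as ridge regression can predict values outside the valid score range.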

⚠️ License

This repository has been published under the MIT license.
