classifying-leaf-disease's Introduction

Classifying Cassava Leaf Disease

This project classifies cassava leaf diseases using deep learning, leveraging image augmentation and transfer learning with CNN models.
(Midterm Project for DATA 2040 - Deep Learning @ Brown University Spring 2021)

Project Description

  • Background: Cassava is a key crop for food security across Sub-Saharan Africa. Yet, viral diseases threaten cassava yields, and are costly to detect manually.
  • As a midterm group project for DATA 2040, this project fine-tunes a series of CNN models to detect cassava leaf diseases, using image augmentation and transfer learning to increase classification accuracy.
  • Why?: Fine-tuned deep learning models may help identify diseased cassava plants more efficiently and ultimately prevent crop loss.
  • The data consists of 21,367 labeled images of cassava leaves belonging to five categories: four disease categories and one category for healthy plants. The images were crowdsourced from farmers in Uganda and labeled by experts at the National Crops Resources Research Institute (NaCRRI) in collaboration with the AI lab at Makerere University, Kampala. The data and task were made available as a Kaggle competition.
  • Result: Using image augmentation and transfer learning, we increased our accuracy from a baseline of 61.5% (majority classifier) to 80% (DenseNet201 model).
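
For context on the baseline figure, a majority classifier always predicts the most frequent class, so its accuracy equals that class's share of the labels. The snippet below is a minimal sketch of that calculation. The function name and the toy labels are illustrative assumptions and are not taken from the project notebook or the real dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Return the accuracy of always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # most_common(1) returns [(label, count)] for the largest class.
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


# Toy labels for illustration only (NOT the real class counts).
toy_labels = ["disease_a"] * 6 + ["healthy"] * 3 + ["disease_b"] * 1
print(majority_baseline_accuracy(toy_labels))
```

Any trained model should beat this number to be considered useful, which is why it serves as the reference point for the DenseNet201 result.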

Methods Used

  • EDA
  • Image data augmentation
  • Deep learning (CNN)
  • Transfer learning
  • Fine-tuning
  • Hyperparameter tuning

Technologies Used

  • Python (3.8)
  • TensorFlow (2.4.1)
  • Keras (2.4.0)
  • Scikit-learn (0.24.0)
  • Pandas (1.2.1)
  • Numpy (1.19.2)

Screenshots

EDA - class balance

Class balance
Cassava disease class balance (normalized). Due to the data imbalance, we used a StratifiedKFold split to preserve class ratios in each fold.
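
As a rough illustration of how a stratified split preserves class ratios, the sketch below runs scikit-learn's `StratifiedKFold` on toy labels. The label counts, the number of folds, and the random seed are assumptions chosen for the example and may differ from the settings used in the notebook.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy, imbalanced labels for illustration only (NOT the real dataset).
labels = np.array([0] * 10 + [1] * 20 + [2] * 70)
# The splitter only needs the features for their length here,
# so a placeholder array is enough.
placeholder_features = np.zeros((len(labels), 1))

# n_splits and random_state are assumed values for this sketch.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
folds = list(skf.split(placeholder_features, labels))

for fold_index, (train_idx, val_idx) in enumerate(folds):
    # Share of each class within this fold's validation set.
    val_ratios = np.bincount(labels[val_idx], minlength=3) / len(val_idx)
    print(f"fold {fold_index}: validation class ratios = {val_ratios}")
```

Each validation fold keeps roughly the same class proportions as the full label set, so a rare class is never missing from a fold by chance.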

EDA - image data by category

Image data by category
Cassava leaf images by disease category. The images vary by angle, lighting, and position.

Model training - baseline VGG16 model

Baseline model
Validation and training accuracy for our baseline VGG16 model over 16 epochs.

Model training - DenseNet model

DenseNet model
Validation and training accuracy for fine-tuned DenseNet model with a ReduceLROnPlateau schedule over 31 epochs.
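
A setup along these lines could be sketched in Keras as follows. This is a minimal sketch, not the notebook's actual code: the classification head, input size, optimizer, learning rate, and the `factor`, `patience`, and `min_lr` values are all assumptions, and the helper names `build_classifier` and `make_lr_callback` are hypothetical.

```python
import tensorflow as tf

NUM_CLASSES = 5  # four disease categories plus healthy


def build_classifier(weights="imagenet", input_shape=(224, 224, 3)):
    """Build a DenseNet201 feature extractor with a small softmax head."""
    base = tf.keras.applications.DenseNet201(
        include_top=False, weights=weights, input_shape=input_shape
    )
    # Freeze the pretrained layers; unfreeze some later to fine-tune.
    base.trainable = False
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # assumed value
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def make_lr_callback():
    """Reduce the learning rate when validation loss stops improving."""
    return tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6  # assumed values
    )


# Example usage (train_ds and val_ds are hypothetical tf.data datasets):
# model = build_classifier()
# model.fit(train_ds, validation_data=val_ds, epochs=31,
#           callbacks=[make_lr_callback()])
```

With these assumed settings, the callback halves the learning rate after three epochs without improvement in validation loss, which lets training take smaller steps once progress stalls.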

Setup

To read the project code as a Jupyter/IPython notebook, open the RAD_Final_Blog_Post_3.ipynb file in your browser.

The project notebook can also be opened and run in Google Colab (with GPU). To download the Kaggle Cassava Leaf Disease Classification dataset, create a Kaggle account and generate an API token. Then replace the Kaggle username and key in the notebook's credentials cell with your own.
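
One common way to supply these credentials is through the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, which the Kaggle API client reads. The sketch below assumes that approach; the notebook's own cell may look different, and the helper name and placeholder values are hypothetical.

```python
import os


def set_kaggle_credentials(username, key):
    """Expose Kaggle API credentials through environment variables."""
    if not username or not key:
        raise ValueError("both username and key are required")
    os.environ["KAGGLE_USERNAME"] = username
    os.environ["KAGGLE_KEY"] = key


# Example usage with placeholder values (replace with your own token):
# set_kaggle_credentials("your_kaggle_username", "your_kaggle_api_key")
#
# The dataset can then be fetched with the Kaggle CLI. The competition
# slug below is an assumption; check the competition page for the exact name.
#   kaggle competitions download -c cassava-leaf-disease-classification
```

Avoid committing real credentials to the repository or sharing a notebook that still contains them.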

Deliverables

We described our project process - from data exploration and augmentation to transfer learning and model fine-tuning - in the following blog posts:

Contributing Members

Acknowledgements

  • The task and data for this project were sourced from Kaggle’s Cassava Leaf Disease Classification competition.
  • As such, many thanks to the Makerere Artificial Intelligence (AI) Lab at Makerere University in Uganda, which applies AI and data science to real-world challenges, as well as to the experts and collaborators from the National Crops Resources Research Institute (NaCRRI) for assisting in preparing this dataset.
  • All of our sources for our EDA and model development are listed in our blog posts (1, 2, and 3) in the Sources Used section.
  • Many thanks to Annie and Roma for your collaboration!

classifying-leaf-disease's People

Contributors

annieptba, drew-solomon