amazon-from-space's Introduction

Machine Learning Engineer Nanodegree Capstone Project

Understanding the Amazon from Space

Overview

The goal of this project is to track changes in the Amazon rainforest due to deforestation using satellite image data. It is a multi-label image classification problem introduced in a past Kaggle competition, Planet: Understanding the Amazon from Space. The final solution is an ensemble of five deep learning models trained with transfer learning using the fastai library. This ensemble would have placed 19th in the competition.

You can read my full project report here

You can see my original project proposal here

You can find the blog post of my project workflow here

Requirements

To run the notebooks you can create a conda environment with all necessary software by running:

conda env create -f environment.yml

Alternatively, you can install the libraries listed in requirements.txt however you see fit.

Data

The dataset consists of 40,479 training images and 61,191 test images. Each image is a 256x256 pixel JPEG. The file train_v2.csv is included, which lists the training file names and their accompanying labels. The labels can be broken down into three categories: cloud cover labels, common labels, and less common labels. There are 17 labels in total: clear, partly cloudy, cloudy, haze, primary, water, habitation, agriculture, road, cultivation, bare ground, slash and burn, selective logging, blooming, conventional mining, artisanal mining, and blow down.
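As a rough illustration of what multi-label means here, the sketch below turns a label file into multi-hot vectors, one 0/1 entry per label. It is not taken from the notebooks. The column names `image_name` and `tags`, the space-separated tag format, and the sample rows are assumptions made for the example, so check them against the actual train_v2.csv before relying on it.

```python
import csv
import io

# Hypothetical excerpt: column names, file names, and tag spellings are
# assumptions for illustration, not copied from the real train_v2.csv.
SAMPLE_CSV = """image_name,tags
train_0,haze primary
train_1,agriculture clear primary water
train_2,clear primary
"""


def load_labels(csv_file):
    """Return (image names, sorted label vocabulary, multi-hot rows)."""
    rows = list(csv.DictReader(csv_file))
    names = [row["image_name"] for row in rows]
    # Each image can carry several tags at once, hence "multi-label".
    tag_sets = [set(row["tags"].split()) for row in rows]
    vocab = sorted(set().union(*tag_sets))
    multi_hot = [
        [1 if label in tags else 0 for label in vocab] for tags in tag_sets
    ]
    return names, vocab, multi_hot


names, vocab, multi_hot = load_labels(io.StringIO(SAMPLE_CSV))
print(vocab)
print(multi_hot[0])
```

On the full file, the vocabulary built this way should contain the 17 labels listed above.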

The data and all models I used to produce my final solution can be obtained by running the get_data.ipynb notebook.

Usage

Getting the data

To reproduce my solution by training from scratch, you first must download the data by running the get_data.ipynb notebook.

Training the models

Then you can proceed to train the models by running the train.ipynb notebook. You will need to run this notebook five times, once for each of the following fastai models: resnet50, resnet101, resnet152, densenet121, densenet169. To switch architectures, change the line in cell [8] to whichever pretrained model you wish to train, for example:

arch = models.resnet152

Creating an ensemble

To create an ensemble of the models you wish to include, you can run the create_ensemble.ipynb notebook. Simply define the model_list in cell [4] to include the models you wish like so:

model_list = ['resnet50.pkl','resnet101.pkl', 'resnet152.pkl', 'densenet121.pkl', 'densenet169.pkl']
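This README does not say how create_ensemble.ipynb combines the models, so the following is only a minimal sketch of one common approach: average each label's predicted probability across models, then apply a threshold. The function names, the toy probabilities, and the threshold of 0.2 are all hypothetical and are not taken from the notebook.

```python
def average_predictions(per_model_probs):
    """Average per-label probabilities across models.

    per_model_probs holds one entry per model; each entry is a list of
    images, and each image is a list of per-label probabilities.
    """
    n_models = len(per_model_probs)
    averaged = []
    # zip(*...) groups the predictions for the same image across models.
    for image_preds in zip(*per_model_probs):
        averaged.append([sum(vals) / n_models for vals in zip(*image_preds)])
    return averaged


def apply_threshold(probs, threshold=0.2):
    """Mark a label as present when its averaged probability is high enough."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


# Toy predictions from two hypothetical models: 2 images, 3 labels each.
model_a = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.0]]
model_b = [[0.7, 0.1, 0.1], [0.0, 0.6, 0.2]]

averaged = average_predictions([model_a, model_b])
ensemble_labels = apply_threshold(averaged, threshold=0.2)
print(ensemble_labels)
```

Averaging tends to smooth out the errors of individual models. The threshold controls how readily a label is predicted and would normally be tuned on validation data.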

Making a submission

To make predictions and submit them to the Kaggle competition to see your score, run the notebook predict_and_submit.ipynb. To submit directly from the notebook you will need the Kaggle command line tool installed. Directions on how to set this up can be found in the get_kaggle_data.ipynb notebook. Alternatively, you can submit manually by navigating to the competition page and selecting the .csv file you'd like to submit.

Note that in either case, you will need to accept the competition's terms before you can submit.
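For orientation, here is a hedged sketch of how 0/1 predictions could be written out as a submission file. It assumes the submission uses the columns `image_name` and `tags`, with tags separated by spaces. That layout, the helper name `write_submission`, and the sample values are assumptions for illustration, so confirm the expected format on the competition page or in predict_and_submit.ipynb.

```python
import csv
import io


def write_submission(out_file, image_names, predictions, labels):
    """Write one row per image with its predicted tags joined by spaces."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["image_name", "tags"])  # assumed header
    for name, row in zip(image_names, predictions):
        tags = " ".join(label for label, flag in zip(labels, row) if flag)
        writer.writerow([name, tags])


# Hypothetical predictions for two test images over three labels.
buffer = io.StringIO()
write_submission(
    buffer,
    image_names=["test_0", "test_1"],
    predictions=[[1, 0, 1], [0, 1, 0]],
    labels=["clear", "haze", "primary"],
)
submission_text = buffer.getvalue()
print(submission_text)
```

To produce a real file, pass a file opened with `open("submission.csv", "w", newline="")` in place of the in-memory buffer.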

amazon-from-space's People

Contributors

ncondo
