
OSSP

Open Source Sea-ice Processing

Open Source Algorithm for Detecting Sea Ice Surface Features in High Resolution Optical Imagery

Nicholas Wright and Christopher Polashenski

Introduction

Welcome to OSSP: a set of tools for detecting surface features in high resolution optical imagery of sea ice. The primary focus is on detecting and differentiating open water, melt ponds, and snow/ice.

The Anaconda distribution of Python is recommended, but any distribution with the appropriate packages will work. You can download Anaconda (Python 2.7) here: https://www.continuum.io/downloads

Dependencies

  • Python 2.7
  • gdal (v2.0 or above)
  • numpy
  • scipy
  • h5py
  • scikit-image (skimage)
  • scikit-learn (sklearn)
  • matplotlib
  • Tkinter

Usage

For detailed usage and installation instructions, see the pdf document 'Algorithm_Instructions.pdf'

batch_process_mp.py

This script combines all steps of the image classification scheme into one. It finds all appropriately formatted files (.tif and .jpg) in the input directory and queues them for processing. Each image is processed as follows: Image Subdivision (Splitter.py) -> Segmentation (Watershed.py) -> Classification (RandomForest.py) -> Calculate statistics -> Recompile classified splits. batch_process_mp.py can utilize more than one processor core during the segmentation and classification phases.
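The file-discovery step can be sketched in a few lines of Python (find_images is a hypothetical helper for illustration, not part of OSSP):

```python
import os

def find_images(input_dir):
    """Recursively collect .tif and .jpg files, as batch_process_mp.py does."""
    matches = []
    for root, _dirs, files in os.walk(input_dir):
        for name in files:
            if name.lower().endswith(('.tif', '.jpg')):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```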

Required Arguments

  • input directory: directory containing all of the images you wish to process. Note that all .jpg and .tif images in the input directory, as well as in all of its sub-directories, will be processed.
  • image type: {'srgb', 'wv02_ms', 'pan'}: the type of imagery you are processing.
    1. 'srgb': RGB imagery taken by a typical camera
    2. 'wv02_ms': DigitalGlobe WorldView 2 multispectral imagery
    3. 'pan': High resolution panchromatic imagery
  • training dataset file: complete filepath of the training dataset you wish to use to analyze the input imagery

Optional Arguments

  • -s | --splits: The number of times to split the input image for improved processing speed. This is rounded to the nearest square number. Default = 9.
  • -p | --parallel: The number of parallel processes to run (i.e. number of cpu cores to utilize). Default = 1.
  • --training_label: The label of a custom training dataset. See advanced section for details. Default = image_type.

Notes:

Example: batch_process_mp.py input_dir im_type training_dataset_file -s 4 -p 2

This example will process all .tif and .jpg files in the input directory, using the training data found in training_dataset_file, splitting each image into four sections and processing them on two cores.

In general, images should be divided into parts small enough to load easily into RAM; how small depends strongly on the computer running these scripts. For best results, segments should typically not exceed 1-2 GB. For the 5-7 MB RGB images provided as test subjects, subdivision is not required (use -s 1 as the optional argument). Processing speed can be increased by combining subdivision with multiple cores. For a full multispectral WorldView scene, which may be 16 GB or larger, 9 or 16 segments are typically needed. The number of parallel processes should be chosen such that num_cores * subdivision filesize << available system RAM.
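As a rough illustration of this sizing rule, the sketch below combines the square-number rounding of -s with the RAM guideline (plan_processing is a hypothetical helper, not part of OSSP; the factor of two is an arbitrary safety margin):

```python
import math

def plan_processing(image_gb, ram_gb, requested_splits=9, cores=1):
    # splits are rounded to the nearest square number, as in batch_process_mp.py
    side = max(1, int(round(math.sqrt(requested_splits))))
    splits = side * side
    segment_gb = image_gb / splits
    # back off cores until num_cores * segment size is well below available RAM
    while cores > 1 and cores * segment_gb > ram_gb / 2.0:
        cores -= 1
    return splits, cores, segment_gb
```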

Splitter.py

This script reads in a raw image, stretches the pixel intensity values to the full 8-bit range, and subdivides the image into s subimages. The output file is in HDF5 format, ready to be read by Watershed.py.
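The 8-bit stretch can be sketched as a simple linear rescale (an illustrative version only; Splitter.py's actual stretch may differ, e.g. by clipping percentiles):

```python
import numpy as np

def stretch_to_8bit(band):
    """Linearly rescale pixel intensities to the full 0-255 range."""
    band = band.astype(np.float64)
    lo, hi = band.min(), band.max()
    if hi == lo:                       # flat band: nothing to stretch
        return np.zeros(band.shape, dtype=np.uint8)
    return ((band - lo) / (hi - lo) * 255.0).astype(np.uint8)
```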

Positional Arguments

  • input_dir: Directory path of the input image
  • filename: Name of the image to be split
  • image type: {'srgb', 'wv02_ms', 'pan'}: the type of imagery you are processing.
    1. 'srgb': RGB imagery taken by a typical camera
    2. 'wv02_ms': DigitalGlobe WorldView 2 multispectral imagery
    3. 'pan': High resolution panchromatic imagery

Optional Arguments

  • --output_dir: Directory path for output images.
  • -s | --splits: The number of times to split the input image for improved processing speed. This is rounded to the nearest square number. Default = 9.
  • -v | --verbose: Display text information and progress of the script.

Watershed.py

This script loads the output of Splitter.py and segments the image using edge detection followed by watershed segmentation.
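The idea of edge detection followed by watershed flooding can be illustrated with scipy alone (a conceptual sketch on a toy image; Watershed.py's actual routines and parameters may differ):

```python
import numpy as np
from scipy import ndimage

def edge_watershed(image, markers):
    """Flood a gradient-magnitude 'elevation' surface from seed markers."""
    img = image.astype(np.float64)
    edges = np.hypot(ndimage.sobel(img, axis=0), ndimage.sobel(img, axis=1))
    edges_u8 = (255.0 * edges / max(edges.max(), 1.0)).astype(np.uint8)
    return ndimage.watershed_ift(edges_u8, markers)

# toy scene: one bright 'floe' on a dark background
image = np.zeros((64, 64), dtype=np.uint8)
image[20:40, 20:40] = 200
markers = np.zeros((64, 64), dtype=np.int16)
markers[5, 5] = 1       # background seed
markers[30, 30] = 2     # floe seed
labels = edge_watershed(image, markers)
```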

Positional Arguments

  • input_dir: Directory path of the input image.
  • filename: Name of the segmented image file (Splitter output: .h5).

Optional Arguments

  • --output_dir: Directory path for output files.
  • --histogram: Display histogram of pixel intensity values before segmentation.
  • -c | --color: Save a color (rgb) version of the input image.
  • -t | --test: Inspect segmentation results before saving output files.
  • -v | --verbose: Display text information and progress of the script.

RandomForest.py

Classifies the segmented image (output of Watershed.py) using a Random Forest machine learning algorithm. Training data can be created on a segmented image using the GUI in training_gui.py.
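The classification step can be illustrated with scikit-learn's RandomForestClassifier, including the OOB score and attribute importances that the -q flag reports (the toy features below stand in for the real per-segment attributes):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# toy training set: two well-separated clusters of segment 'features'
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 0.5, (50, 3)),     # e.g. open water segments
               rng.normal(10, 0.5, (50, 3))])   # e.g. snow/ice segments
y = np.array([0] * 50 + [1] * 50)

clf = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=0)
clf.fit(X, y)

print(clf.oob_score_)             # quality assessment, as reported with -q
print(clf.feature_importances_)   # attribute importance, as reported with -q
pred = clf.predict([[10.0, 10.0, 10.0]])
```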

Positional Arguments

  • input_filename: Directory and filename of image watersheds to be classified.
  • training_dataset: Directory and filename of the training dataset (.h5)
  • training_label: name of training classification list

Optional Arguments

  • -q | --quality: Display a quality assessment (OOB score and attribute importance) of the training dataset.
  • --debug: Display one classified subimage at a time, with the option of quitting after each.
  • -v | --verbose: Display text information and progress of the script.

training_gui.py

Positional Arguments:

  • input: In mode 1 this is a folder containing the training images. In mode 2, the input is a classified image file (.h5).
  • image type: {'srgb', 'wv02_ms', 'pan'}: the type of imagery you are processing.
    1. 'srgb': RGB imagery taken by a typical camera
    2. 'wv02_ms': DigitalGlobe WorldView 2 multispectral imagery
    3. 'pan': High resolution panchromatic imagery
  • -m | --mode: {1, 2}: how you would like to use the training GUI. 1: create a training dataset from a folder of raw images. 2: assess the accuracy of a classified image (output of RandomForest.py).

Optional arguments:

  • --tds_file: Only used for mode 1. Existing training dataset file. Will create a new one with this name if none exists. Default = <image_type>_training_data.h5.
  • -s | --splits: The number of times to split the input image for improved processing speed. This is rounded to the nearest square number. Default = 9.

Contact

Nicholas Wright
