semantic_segmentation's Introduction

Semantic Segmentation using a Fully Convolutional Neural Network

Introduction

This repository contains a set of Python scripts for training and testing semantic segmentation using a fully convolutional neural network. The segmentation network is based on the FCN architecture described by Jonathan Long et al. in "Fully Convolutional Networks for Semantic Segmentation".

How to Train the Model

  1. Since the network uses VGG-16 weights, first download the VGG-16 pre-trained weights from https://www.cs.toronto.edu/~frossard/vgg16/vgg16_weights.npz and save the file in the pretrained_weights folder.
  2. Download the KITTI road dataset and save it in the data/data_road folder.
  3. Open a terminal and run python fcn.py.
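The downloaded weights file is an .npz archive holding each VGG-16 layer's parameters as named arrays. A minimal sketch of how such an archive can be read with NumPy; the key naming shown (e.g. conv1_1_W / conv1_1_b) follows the convention used by Frossard's export, and a small stand-in file is created here so the snippet is self-contained:

```python
import numpy as np

# Build a stand-in .npz using the layer-name convention of Frossard's
# VGG-16 export (conv<block>_<layer>_W / _b). The real vgg16_weights.npz
# is loaded the same way; verify the key names against the actual file.
np.savez("vgg16_weights_demo.npz",
         conv1_1_W=np.zeros((3, 3, 3, 64), dtype=np.float32),
         conv1_1_b=np.zeros(64, dtype=np.float32))

weights = np.load("vgg16_weights_demo.npz")
kernel = weights["conv1_1_W"]   # 3x3 kernels, 3 input channels, 64 filters
bias = weights["conv1_1_b"]
print(kernel.shape, bias.shape)  # (3, 3, 3, 64) (64,)
```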

Please note that training checkpoints will be saved to the checkpoints/kitti folder and logs to the graphs/kitti folder. You can therefore run tensorboard --logdir=graphs/kitti to start TensorBoard and inspect the training process.

The following images show sample outputs obtained with the trained model.

[Sample segmentation outputs from the trained model]

Network Architecture

We implement the FCN-8s model described in the paper by Jonathan Long et al. The following figure, generated with TensorBoard, shows the architecture of the network.

[Network architecture diagram]
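The defining feature of FCN-8s is its skip architecture: stride-32 class scores (conv7) are upsampled 2x and fused with the stride-16 pool4 features, the result is upsampled 2x again and fused with the stride-8 pool3 features, and a final 8x transposed convolution restores input resolution. A small sketch of this shape arithmetic for any input whose sides are divisible by 32 (the example size 160x576 is illustrative, not a claim about this repository's preprocessing):

```python
def fcn8s_resolutions(h, w):
    """Feature-map sizes at the FCN-8s fusion points (VGG-16 backbone)."""
    pool3 = (h // 8,  w // 8)    # stride-8 features
    pool4 = (h // 16, w // 16)   # stride-16 features
    conv7 = (h // 32, w // 32)   # stride-32 class scores
    # conv7 scores upsampled 2x line up with pool4 ...
    fuse4 = (conv7[0] * 2, conv7[1] * 2)
    assert fuse4 == pool4
    # ... the fused result upsampled 2x lines up with pool3 ...
    fuse3 = (fuse4[0] * 2, fuse4[1] * 2)
    assert fuse3 == pool3
    # ... and a final 8x transposed convolution restores input size.
    return (fuse3[0] * 8, fuse3[1] * 8)

print(fcn8s_resolutions(160, 576))  # → (160, 576)
```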

Additionally, the main functionality of each Python script in this repository is described below.

  • fcn.py: The main script of the repository. Its key methods are build, optimize and inference. The build method loads the pre-trained weights and builds the network, the optimize method runs the training, and inference is used for testing on new images.
  • loss.py: Contains the loss function we optimize during training.
  • helper.py: Contains utility functions for generating training and testing batches.
  • model_utils.py: Contains utility functions for building the fully convolutional network from the VGG-16 pre-trained weights.
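The standard loss for FCN-style segmentation is softmax cross-entropy computed per pixel and averaged over the image. A NumPy sketch of that idea; this is an illustration of the usual formulation, not necessarily the exact implementation in loss.py:

```python
import numpy as np

def pixelwise_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over all pixels.

    logits: (H, W, C) raw class scores; labels: (H, W, C) one-hot masks.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(labels * log_probs).sum(axis=-1).mean()

# Two pixels, two classes (road / not-road); predictions match the labels,
# so the loss is close to zero.
logits = np.array([[[5.0, -5.0], [-5.0, 5.0]]])
labels = np.array([[[1.0, 0.0], [0.0, 1.0]]])
loss = pixelwise_cross_entropy(logits, labels)
print(loss)  # small positive value
```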

The KITTI dataset

For training the semantic segmentation network, we used the KITTI road dataset. The dataset consists of 289 training and 290 test images and contains three categories of road scene (training/test counts in parentheses):

  • uu - urban unmarked (98/100)
  • um - urban marked (95/96)
  • umm - urban multiple marked lanes (96/94)
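The per-category counts above add up to the stated totals:

```python
# (train, test) image counts per KITTI road category, from the list above.
splits = {"uu": (98, 100), "um": (95, 96), "umm": (96, 94)}
train = sum(t for t, _ in splits.values())
test = sum(s for _, s in splits.values())
print(train, test)  # → 289 290
```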

Training the Model

When training any deep learning model, selecting suitable hyper-parameters plays a big role. For this project, we carefully selected the following hyper-parameters:

  • Learning rate: 1e-5. We used the Adam optimizer, for which 1e-3 or 1e-4 are the usual suggested learning rates. However, when experimenting with different learning rates we found that 1e-5 works better than those values.
  • Number of epochs: 25. The training dataset is small, with only 289 training examples, so we used a moderate number of epochs.
  • Batch size: 8. Based on the size of the training dataset, we selected a batch size of 8 images.
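With 289 training images and a batch size of 8, one epoch is 37 batches (the last batch is partial), so 25 epochs amounts to 925 optimizer steps:

```python
import math

num_examples, batch_size, epochs = 289, 8, 25
batches_per_epoch = math.ceil(num_examples / batch_size)  # 37; last batch holds 1 image
total_steps = batches_per_epoch * epochs
print(batches_per_epoch, total_steps)  # → 37 925
```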

The following image shows how the training loss changes as the model trains.

[Training loss graph]

Conclusion

In this project, we investigated how to use a fully convolutional neural network for semantic segmentation. We tested our model on the KITTI dataset. The results indicate that our model is quite capable of separating road pixels from the rest. However, we would like to work on the following additional tasks to increase the accuracy of our model.

  1. Data augmentation: During testing, we found that our model failed to label the road surface under inadequate lighting. Data augmentation can be used to generate additional training examples with different lighting conditions, and this extra data should help the model overcome the issue.
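One simple form of lighting augmentation is random brightness scaling. A minimal NumPy sketch of the idea; random_brightness is a hypothetical helper, not a function from helper.py:

```python
import numpy as np

def random_brightness(image, rng, max_delta=0.3):
    """Scale pixel intensities by a random factor in [1-max_delta, 1+max_delta].

    image: uint8 RGB array. The label mask is left untouched, since
    changing lighting does not move the road pixels.
    """
    factor = rng.uniform(1.0 - max_delta, 1.0 + max_delta)
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
dark_road = np.full((2, 2, 3), 60, dtype=np.uint8)  # a uniformly dim patch
augmented = random_brightness(dark_road, rng)
print(augmented.shape, augmented.dtype)
```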

semantic_segmentation's People

Contributors

upul


semantic_segmentation's Issues

train

How long will it take to train?

Fully Connected layer

Is every input node multiplied by the same filter values, or is each input node multiplied by a different set of filter values?

Training with Customised Dataset

Hello,

Is it possible to use this training model for semantically segmenting images from a custom-made dataset where we have RGB images and ground-truth data, with say 12 classes?

It'll be really helpful to receive responses.
