DeepLab-v3-plus Semantic Segmentation in TensorFlow

This repo attempts to reproduce Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (DeepLabv3+) in TensorFlow on the PASCAL VOC dataset. The implementation is largely based on my DeepLabv3 implementation, which was originally based on DrSleep's DeepLab v2 implementation and the TensorFlow models Resnet implementation.

Setup

Please install the latest version of TensorFlow (r1.6) and use Python 3.

  • Download and extract the PASCAL VOC training/validation data (2GB tar file), specifying its location with --data_dir.
  • Download and extract the augmented segmentation data (thanks to DrSleep), specifying its location with --data_dir and --label_data_dir (namely, $data_dir/$label_data_dir).
  • For inference, the trained model with 77.31% mIoU on the Pascal VOC 2012 validation dataset is available here. Download and extract it to --model_dir.
  • For training, download and extract the pre-trained Resnet v2 101 model from slim, specifying its location with --pre_trained_model.

Training

To train the model, you first need to convert the original data to the TensorFlow TFRecord format. This speeds up training.

python create_pascal_tf_record.py --data_dir DATA_DIR \
                                  --image_data_dir IMAGE_DATA_DIR \
                                  --label_data_dir LABEL_DATA_DIR 
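The three directory flags above can be pictured as follows. This is a minimal sketch, not the script's actual code: it assumes that the image directory, like the label directory, is resolved relative to --data_dir, that images are stored as .jpg and label masks as .png, and the helper name and the example ID are hypothetical.

```python
import os


def pair_image_and_label_paths(data_dir, image_data_dir, label_data_dir, names):
    """Return (image_path, label_path) tuples for the given example names.

    Assumptions (not confirmed by the source): both subdirectories live
    under data_dir, images are <name>.jpg and label masks are <name>.png.
    """
    pairs = []
    for name in names:
        image_path = os.path.join(data_dir, image_data_dir, name + ".jpg")
        label_path = os.path.join(data_dir, label_data_dir, name + ".png")
        pairs.append((image_path, label_path))
    return pairs


# Illustrative example ID; any name present in both directories would do.
pairs = pair_image_and_label_paths(
    "DATA_DIR", "IMAGE_DATA_DIR", "LABEL_DATA_DIR", ["2007_000032"]
)
print(pairs[0])
```

Each pair would then be serialized into one TFRecord entry by the conversion script.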

Once you have created the TFRecord files for the PASCAL VOC training and validation data, you can start training the model as follows:

python train.py --model_dir MODEL_DIR --pre_trained_model PRE_TRAINED_MODEL

Here, --pre_trained_model points to the pre-trained Resnet model, whereas --model_dir holds the trained DeepLabv3+ checkpoints. If --model_dir already contains valid checkpoints, training resumes from the checkpoint found there.
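The choice between the two starting points can be sketched as below. This is an illustration under stated assumptions rather than the logic in train.py: it uses the presence of the small "checkpoint" state file that TensorFlow savers keep in their output directory as a proxy for "valid checkpoints exist", and both function names are hypothetical.

```python
import os
import tempfile


def has_checkpoint(model_dir):
    """Return True if model_dir contains TensorFlow's "checkpoint" state file."""
    return os.path.isfile(os.path.join(model_dir, "checkpoint"))


def describe_start(model_dir, pre_trained_model):
    # Hypothetical helper: reports which weights training would begin from.
    if has_checkpoint(model_dir):
        return "resume from " + model_dir
    return "initialize from " + pre_trained_model


with tempfile.TemporaryDirectory() as model_dir:
    # Empty directory: fall back to the pre-trained Resnet weights.
    print(describe_start(model_dir, "PRE_TRAINED_MODEL"))
    # Simulate a saved checkpoint by creating the state file.
    open(os.path.join(model_dir, "checkpoint"), "w").close()
    print(describe_start(model_dir, "PRE_TRAINED_MODEL"))
```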

You can see other options with the following command:

python train.py --help

The training process can be visualized with TensorBoard as follows:

tensorboard --logdir MODEL_DIR

Evaluation

To evaluate how the model performs, one can use the following command:

python evaluate.py --help

The current best model built by this implementation achieves 77.31% mIoU on the Pascal VOC 2012 validation dataset.

Network Backbone | train OS | eval OS | SC | mIOU paper | mIOU repo
Resnet101        | 16       | 16      |    | 78.85%     | 77.31%
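For readers unfamiliar with the metric, mean intersection-over-union (mIoU) can be computed from a confusion matrix as sketched below. This is a plain-Python illustration, not the code in evaluate.py; it assumes that pixels labeled 255 (the PASCAL VOC "void" label) are skipped and that classes absent from both the labels and the predictions are left out of the average.

```python
def mean_iou(y_true, y_pred, num_classes, ignore_label=255):
    """Mean IoU over flattened label sequences of equal length."""
    # confusion[t][p] counts pixels with true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue  # void pixels do not contribute to the score
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes that never occur
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and one void pixel.
score = mean_iou([0, 0, 1, 1, 255], [0, 1, 1, 1, 0], num_classes=2)
print(round(score, 4))  # → 0.5833
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.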

The above model was trained for about 9.5 hours (on a Tesla V100 with TensorFlow r1.6) with the following parameters:

python train.py --train_epochs 43 --batch_size 15 --weight_decay 2e-4 --model_dir models/ba=15,wd=2e-4,max_iter=30k --max_iter 30000
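One plausible reading of how --train_epochs, --batch_size and --max_iter relate in the command above is sketched here. The training-set size of 10,582 images is an assumption (the commonly cited size of the augmented PASCAL VOC training set); the source does not state it.

```python
import math

NUM_TRAIN_IMAGES = 10582  # assumption: size of the augmented training set
batch_size = 15           # from --batch_size
max_iter = 30000          # from --max_iter

# Total images processed over the whole run.
images_seen = batch_size * max_iter

# Number of full passes over the data needed to reach max_iter steps.
epochs_needed = math.ceil(images_seen / NUM_TRAIN_IMAGES)

print(images_seen, epochs_needed)  # → 450000 43
```

Under that assumption, 43 epochs is the smallest whole number of passes that covers 30,000 iterations at batch size 15.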

Inference

To apply semantic segmentation to your own images, one can use the following command:

python inference.py --data_dir DATA_DIR --infer_data_list INFER_DATA_LIST --model_dir MODEL_DIR 

The trained model is available here. A detailed explanation of the output masks, such as the meaning of each color, can be found in DrSleep's repo.
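As a quick reference for reading the masks, the sketch below generates the standard PASCAL VOC color palette, which maps each class index to an RGB color by spreading the bits of the index across the three channels. It assumes the masks produced here follow that standard palette; DrSleep's repo remains the authoritative description.

```python
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def voc_colormap(num_colors=256):
    """Return a list of (r, g, b) tuples following the PASCAL VOC palette."""
    cmap = []
    for index in range(num_colors):
        r = g = b = 0
        c = index
        for j in range(8):
            # Take the three lowest bits of c and place them, one per
            # channel, starting from the most significant bit position.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap.append((r, g, b))
    return cmap


cmap = voc_colormap()
for index, name in enumerate(VOC_CLASSES):
    print(index, name, cmap[index])
```

For example, under this palette the background is black (0, 0, 0) and "person" is (192, 128, 128).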

TODO:

Pull requests are welcome.

  • Implement Decoder
  • Resnet as Network Backbone
  • Xception as Network Backbone
  • Implement depthwise separable convolutions
  • Make network more GPU memory efficient (i.e. support larger batch size)
  • Multi-GPU support
  • Channels first support (Apparently large performance boost on GPU)
  • Model pretrained on MS-COCO
  • Unit test

Acknowledgment

This repo borrows code heavily from
