nearthlab / image-segmentation

Mask R-CNN, FPN, LinkNet, PSPNet and UNet with multiple backbone architectures support readily available

Home Page: http://nearthlab.com/en/

License: MIT License

Python 99.86% Shell 0.14%
mask-rcnn fpn linknet pspnet unet tensorflow keras semantic-segmentation instance-segmentation pretrained pre-trained image-segmantation object-detection

image-segmentation's Introduction

image-segmentation

This repository includes:

  • A re-implementation of matterport/Mask_RCNN with support for multiple backbones (with ImageNet-pretrained weights), built on the backbone model implementations in qubvel/classification_models (see below for the available backbone architectures)
  • Unified training, inference and evaluation code for Mask R-CNN and several semantic segmentation models (from qubvel/segmentation_models), whose parameters you can easily modify through a simple configuration file interface.
  • COCO dataset and KITTI dataset viewers
  [Available segmentation models]
  Instance:
    'maskrcnn'
  Semantic:
    'fpn', 'linknet', 'pspnet', 'unet'
  
  [Available backbone architectures]
  MobileNet:
    'mobilenetv2', 'mobilenet' 
  DenseNet:
    'densenet121', 'densenet169', 'densenet201'  
  ResNet:
    'resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152'
  ResNext:
    'resnext50', 'resnext101'
  SE-Net:
    'seresnet18', 'seresnet34', 'seresnet50', 'seresnet101', 'seresnet152', 'seresnext50', 'seresnext101', 'senet154'
  Resnet V2:
    'resnet50v2', 'resnet101v2', 'resnet152v2'   
  Inception:
    'inceptionv3', 'inceptionresnetv2', 'xception'
  NASNet:
    'nasnetmobile', 'nasnetlarge'
  VGG:
    'vgg16', 'vgg19'
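
For reference, the model and backbone names above map directly onto qubvel/segmentation_models and qubvel/classification_models. Below is a minimal sketch of building one of these semantic models directly with that library, bypassing this repository's configuration file interface (assuming segmentation-models==0.2.0 as pinned in the requirements; the keyword arguments follow that library's API, not this repository's):

  # Minimal sketch using qubvel/segmentation_models directly, not this repository's
  # config-file interface. The backbone name is one of those listed above.
  from segmentation_models import Unet

  # UNet with a ResNet34 backbone and ImageNet-pretrained encoder weights
  model = Unet(backbone_name='resnet34', encoder_weights='imagenet')
  model.summary()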
  


Example results:

  • UNet with a SeResNext101 backbone, trained on a synthetic dataset using this repository
  • UNet with a SeResNext50 backbone, trained on only 500 images plus augmentation using this repository (left: input image, middle: ground truth, right: prediction)
  • FPN with a ResNet18 backbone, trained on only 180 images using this repository
  • Mask R-CNN with a ResNet101 backbone, trained on the COCO dataset, with the weights file ported from matterport/Mask_RCNN (see Custom Backbone for more details)

Installation

i. How to set up a virtual environment and install in it

  sudo apt-get install virtualenv
  virtualenv -p python3 venv
  git clone https://github.com/nearthlab/image-segmentation
  cd image-segmentation
  source activate 
  cat requirements.txt | while read p; do pip install $p; done

You should run the following commands every time you open a new terminal in order to run any of the Python files:

  cd /path/to/image-segmentation
  source activate
  # the second line is equivalent to: 
  # source ../venv/bin/activate && export PYTHONPATH=`pwd`/image-segmentation
  # i.e. activating the virtual environment and adding the image-segmentation/image-segmentation folder to the PYTHONPATH

ii. How to install without a virtual environment
Note that working in a virtual environment is highly recommended, but if you insist on not using one, you can still do so:

  git clone https://github.com/nearthlab/image-segmentation
  cd image-segmentation
  cat requirements.txt | while read p; do pip install --user $p; done
  • You may need to reinstall tensorflow(-gpu) if the automatically installed one is not suitable for your local environment.

Requirements

1. Python 3.5+
2. segmentation-models==0.2.0
3. keras>=2.2.0
4. keras-applications>=1.0.7 
5. tensorflow(-gpu)>=1.8.0 (tested on 1.10.0)

How to run examples

Please read the instructions in the README.md file in each example folder.

  1. Custom Backbone
    This example illustrates how to build Mask R-CNN with your own custom backbone architecture (a generic sketch of what a backbone provides follows this list). In particular, I adopted matterport's implementation of ResNet, which is slightly different from qubvel's. Moreover, you can run inference using the pretrained MaskRCNN_coco.h5. (I slightly modified the 'mask_rcnn_coco.h5' from matterport/Mask_RCNN/releases to make this example work: the only differences are the layer names.)

  2. Imagenet Classification
    This example shows the ImageNet classification results for various backbone architectures.

  3. Create KITTI Label
    This example is the code I used to simplify some of the object class labels in the KITTI dataset. (For instance, I merged the five separate classes 'car', 'truck', 'bus', 'caravan' and 'trailer' into a single class called 'vehicle'.)

  4. Configurations
    Some example cfg files that describe the segmentation models and training processes
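
A rough sketch of what "backbone" means in the Custom Backbone example above: for Mask R-CNN (and FPN-style models in general), a backbone is essentially a function that maps an input image to feature maps at several strides. The toy code below only illustrates that idea; the layer names are hypothetical and this is not the exact interface this repository expects.

  # Toy illustration of a backbone for an FPN-style head: feature maps at
  # strides 4, 8, 16 and 32 (C2..C5 in matterport's naming). Generic sketch,
  # not this repository's actual backbone interface.
  from keras import layers

  def toy_backbone(input_image):
      x  = layers.Conv2D(32,  3, strides=2, padding='same', activation='relu')(input_image)  # stride 2
      c2 = layers.Conv2D(64,  3, strides=2, padding='same', activation='relu')(x)            # stride 4
      c3 = layers.Conv2D(128, 3, strides=2, padding='same', activation='relu')(c2)           # stride 8
      c4 = layers.Conv2D(256, 3, strides=2, padding='same', activation='relu')(c3)           # stride 16
      c5 = layers.Conv2D(512, 3, strides=2, padding='same', activation='relu')(c4)           # stride 32
      return [c2, c3, c4, c5]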

How to train your own FPN / LinkNet / PSPNet / UNet model on KITTI dataset

i. Download the modified KITTI dataset from the release page (or convert your own dataset into the same format) and place it under the datasets folder.

  • The KITTI dataset is a public dataset available online. I simply split it into training and validation sets and simplified the labels using create_kitti_label.py (a minimal sketch of this kind of label mapping follows this list).

  • Note that this dataset is very small, containing only 180 training images and 20 validation images. If you want to train a model for a serious purpose, you should consider using a much larger dataset.

  • To view the KITTI dataset, run:

  python kitti_viewer.py -d=datasets/KITTI
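
A hedged sketch of the label simplification mentioned above; the merged class names come from the Create KITTI Label description earlier, but the actual mapping and code live in create_kitti_label.py:

  # Hypothetical illustration of merging several KITTI classes into one
  # (the real mapping is defined in create_kitti_label.py).
  LABEL_MAP = {
      'car': 'vehicle',
      'truck': 'vehicle',
      'bus': 'vehicle',
      'caravan': 'vehicle',
      'trailer': 'vehicle',
  }

  def simplify_label(name):
      """Map an original KITTI class name to its simplified class."""
      return LABEL_MAP.get(name, name)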

ii. Choose your model and copy the corresponding cfg files from examples/configs. For example, if you want to train a UNet model:

  cd /path/to/image-segmentation
  mkdir -p plans/unet
  cp examples/configs/unet/*.cfg plans/unet

iii. [Optional] Tune some of the model and training parameters in the config files you have just copied. Read the comments in the example config files for what each parameter means. [Note that you have to declare a variable in a .cfg file in the format {type}-{VARIABLE_NAME} = {value}.]
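
For illustration only, a hypothetical snippet in that format (the variable and type names below are made up; the real ones are documented in the example cfg files under examples/configs):

  # hypothetical variable names; see the example cfg files for the real ones
  str-BACKBONE_NAME = resnet18
  int-BATCH_SIZE = 4
  float-LEARNING_RATE = 0.001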

iv. Run train.py:

  python train.py -s plans/unet -d datasets/KITTI \
  -m plans/unet/unet.cfg \
  -t plans/unet/train_unet_decoder.cfg plans/unet/train_unet_all.cfg

This script will train the UNet model in two stages, using the training settings in plans/unet/train_unet_decoder.cfg followed by plans/unet/train_unet_all.cfg. The idea is: first train only the decoder while the backbone is frozen with its ImageNet-pretrained weights loaded, then fine-tune the entire model in the second stage. You can provide as many training cfg files as you wish, dividing the training into multiple stages.
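
A minimal, hedged illustration of this freeze-then-fine-tune idea in plain Keras, using a toy model; in this repository the equivalent behaviour is driven by the training cfg files rather than written by hand:

  # Toy two-layer model; 'backbone_conv' and 'decoder_conv' are hypothetical names.
  from keras import layers, models

  inp = layers.Input((None, None, 3))
  features = layers.Conv2D(8, 3, padding='same', name='backbone_conv')(inp)
  mask = layers.Conv2D(1, 1, activation='sigmoid', name='decoder_conv')(features)
  model = models.Model(inp, mask)

  # Stage 1: freeze the backbone and train the decoder only
  model.get_layer('backbone_conv').trainable = False
  model.compile(optimizer='adam', loss='binary_crossentropy')
  # model.fit(...) with the stage-1 settings

  # Stage 2: unfreeze everything and fine-tune the whole model
  model.get_layer('backbone_conv').trainable = True
  model.compile(optimizer='adam', loss='binary_crossentropy')
  # model.fit(...) with the stage-2 settings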

Once the training is done, you will find three files, 'class_names.json', 'infer.cfg' and 'best_model.h5', which you can use later for inference.

v. KITTI Evaluation:

  python evaluate_kitti.py -c /path/to/infer.cfg -w /path/to/best_model.h5 -l /path/to/class_names.json

How to train your own MaskRCNN model on COCO dataset

i. Download the COCO dataset. To do this, simply run:

  cd /path/to/image-segmentation/datasets
  ./download_coco.sh
  • To view the COCO dataset, run:
  python coco_viewer.py -d=datasets/coco

ii. Copy the example cfg files from examples/configs/maskrcnn.

  cd /path/to/image-segmentation
  mkdir -p plans/maskrcnn
  cp examples/configs/maskrcnn/*.cfg plans/maskrcnn

iii. [Optional] Tune some of the model and training parameters in the config files you have just copied. Read the comments in the example config files for what each parameter means. [Note that you have to declare a variable in a .cfg file in the format {type}-{VARIABLE_NAME} = {value}, as illustrated in the previous section.]

iv. Run train.py:

  python train.py -s plans/maskrcnn -d datasets/coco \
  -m plans/maskrcnn/maskrcnn.cfg \
  -t plans/maskrcnn/train_maskrcnn_heads.cfg plans/maskrcnn/train_maskrcnn_stage3up.cfg plans/maskrcnn/train_maskrcnn_all.cfg

This will train your Mask R-CNN model in three stages (heads → stage3+ → all), as suggested in matterport/Mask_RCNN. Likewise, once the training is done you will find the three files 'class_names.json', 'infer.cfg' and 'best_model.h5', which you can use later for inference.

v. COCO Evaluation:

  python evaluate_coco.py -c /path/to/infer.cfg -w /path/to/best_model.h5 -l /path/to/class_names.json

How to visualize inference

You can visualize your model's inference in a pop-up window:

python infer_gui.py -c=/path/to/infer.cfg -w=/path/to/best_model.h5 -l=/path/to/class_names.json \
-i=/path/to/a/directory/containing/image_files

or save the results as image files. [This will create a directory named 'results' under the directory you provided in the -i option, and write the visualized inference images into it.]

python infer.py -c=/path/to/infer.cfg -w=/path/to/best_model.h5 -l=/path/to/class_names.json \
-i=/path/to/a/directory/containing/image_files

References

@misc{matterport_maskrcnn_2017,
  title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},
  author={Waleed Abdulla},
  year={2017},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/matterport/Mask_RCNN}},
}

image-segmentation's People

Contributors

jiwoong-choi


image-segmentation's Issues

Training does not match Matterport

I'm training a resnet50 (custom backbone/configs/maskrcnn.cfg) with a grocery products dataset.
Even after several attempts, the model stagnates at a val_loss close to 3.5; however, the same training with the same parameters using Matterport behaves very differently (better).

In addition, the weights files differ in size: Nearthlab's is 172 MB, while Matterport's is 180 MB.

Given that the difference between the two frameworks in the case of resnet50 is just the name of the layers, what could be causing this problem?

Is there any optimization during training that modifies the weights?

Because I realized that Matterport's resnet50 doesn't work on a Jetson Nano, whereas Nearthlab's does.

Am I doing something wrong? Is there something else I should set to make it work?

Logs:

Nearthlab :

Epoch 1/100
100/100 [==============================] - 113s 1s/step - loss: 9.2259 - val_loss: 6.9288
Epoch 2/100
100/100 [==============================] - 58s 580ms/step - loss: 7.7932 - val_loss: 5.3660
Epoch 3/100
100/100 [==============================] - 59s 590ms/step - loss: 6.9681 - val_loss: 4.7230
Epoch 4/100
100/100 [==============================] - 59s 593ms/step - loss: 6.2129 - val_loss: 4.0161
Epoch 5/100
100/100 [==============================] - 59s 587ms/step - loss: 5.5432 - val_loss: 3.9557
Epoch 6/100
100/100 [==============================] - 59s 586ms/step - loss: 5.0773 - val_loss: 3.4220
Epoch 7/100
100/100 [==============================] - 58s 582ms/step - loss: 4.5639 - val_loss: 3.5953
Epoch 8/100
100/100 [==============================] - 59s 590ms/step - loss: 4.0430 - val_loss: 3.5199
Epoch 9/100
100/100 [==============================] - 58s 582ms/step - loss: 3.8537 - val_loss: 3.5656
Epoch 10/100
100/100 [==============================] - 57s 574ms/step - loss: 3.8129 - val_loss: 3.4385
..... val_loss stagnates
Epoch 15/100
100/100 [==============================] - 58s 576ms/step - loss: 3.1963 - val_loss: 3.9125
.....
Epoch 38/100
100/100 [==============================] - 61s 610ms/step - loss: 2.8903 - val_loss: 3.3682
.....

Matterport:

Epoch 1/100
100/100 [==============================] - 112s 1s/step - loss: 3.8189 - rpn_class_loss: 0.3061 - rpn_bbox_loss: 0.8670 - mrcnn_class_loss: 1.1758 - mrcnn_bbox_loss: 0.8026 - mrcnn_mask_loss: 0.6674 - val_loss: 3.4789 - val_rpn_class_loss: 0.1830 - val_rpn_bbox_loss: 0.6418 - val_mrcnn_class_loss: 1.2534 - val_mrcnn_bbox_loss: 0.7054 - val_mrcnn_mask_loss: 0.6953
Epoch 2/100
100/100 [==============================] - 95s 945ms/step - loss: 3.3036 - rpn_class_loss: 0.1090 - rpn_bbox_loss: 0.5533 - mrcnn_class_loss: 1.2661 - mrcnn_bbox_loss: 0.6808 - mrcnn_mask_loss: 0.6943 - val_loss: 3.3427 - val_rpn_class_loss: 0.0787 - val_rpn_bbox_loss: 0.5562 - val_mrcnn_class_loss: 1.3297 - val_mrcnn_bbox_loss: 0.6842 - val_mrcnn_mask_loss: 0.6938
Epoch 3/100
100/100 [==============================] - 94s 936ms/step - loss: 3.1970 - rpn_class_loss: 0.0686 - rpn_bbox_loss: 0.4841 - mrcnn_class_loss: 1.2985 - mrcnn_bbox_loss: 0.6518 - mrcnn_mask_loss: 0.6940 - val_loss: 3.0237 - val_rpn_class_loss: 0.0571 - val_rpn_bbox_loss: 0.4428 - val_mrcnn_class_loss: 1.1865 - val_mrcnn_bbox_loss: 0.6436 - val_mrcnn_mask_loss: 0.6938
Epoch 4/100
100/100 [==============================] - 95s 947ms/step - loss: 3.1986 - rpn_class_loss: 0.0728 - rpn_bbox_loss: 0.4853 - mrcnn_class_loss: 1.3238 - mrcnn_bbox_loss: 0.6261 - mrcnn_mask_loss: 0.6905 - val_loss: 3.0983 - val_rpn_class_loss: 0.0509 - val_rpn_bbox_loss: 0.4753 - val_mrcnn_class_loss: 1.2831 - val_mrcnn_bbox_loss: 0.5961 - val_mrcnn_mask_loss: 0.6928
Epoch 5/100
100/100 [==============================] - 91s 914ms/step - loss: 3.0763 - rpn_class_loss: 0.0421 - rpn_bbox_loss: 0.4181 - mrcnn_class_loss: 1.3182 - mrcnn_bbox_loss: 0.6055 - mrcnn_mask_loss: 0.6924 - val_loss: 3.0088 - val_rpn_class_loss: 0.0330 - val_rpn_bbox_loss: 0.4035 - val_mrcnn_class_loss: 1.2702 - val_mrcnn_bbox_loss: 0.6103 - val_mrcnn_mask_loss: 0.6918
Epoch 6/100
100/100 [==============================] - 90s 902ms/step - loss: 3.0606 - rpn_class_loss: 0.0542 - rpn_bbox_loss: 0.4186 - mrcnn_class_loss: 1.3160 - mrcnn_bbox_loss: 0.5800 - mrcnn_mask_loss: 0.6919 - val_loss: 2.8770 - val_rpn_class_loss: 0.0678 - val_rpn_bbox_loss: 0.4394 - val_mrcnn_class_loss: 1.1240 - val_mrcnn_bbox_loss: 0.5563 - val_mrcnn_mask_loss: 0.6894
Epoch 7/100
100/100 [==============================] - 94s 940ms/step - loss: 2.9551 - rpn_class_loss: 0.0400 - rpn_bbox_loss: 0.4016 - mrcnn_class_loss: 1.2583 - mrcnn_bbox_loss: 0.5634 - mrcnn_mask_loss: 0.6917 - val_loss: 2.9014 - val_rpn_class_loss: 0.0453 - val_rpn_bbox_loss: 0.3620 - val_mrcnn_class_loss: 1.2578 - val_mrcnn_bbox_loss: 0.5445 - val_mrcnn_mask_loss: 0.6919
Epoch 8/100
100/100 [==============================] - 91s 908ms/step - loss: 2.9892 - rpn_class_loss: 0.0621 - rpn_bbox_loss: 0.4153 - mrcnn_class_loss: 1.2762 - mrcnn_bbox_loss: 0.5460 - mrcnn_mask_loss: 0.6895 - val_loss: 2.7237 - val_rpn_class_loss: 0.0407 - val_rpn_bbox_loss: 0.3425 - val_mrcnn_class_loss: 1.1253 - val_mrcnn_bbox_loss: 0.5290 - val_mrcnn_mask_loss: 0.6863
Epoch 9/100
100/100 [==============================] - 92s 920ms/step - loss: 2.7157 - rpn_class_loss: 0.0341 - rpn_bbox_loss: 0.3368 - mrcnn_class_loss: 1.1446 - mrcnn_bbox_loss: 0.5110 - mrcnn_mask_loss: 0.6891 - val_loss: 2.8300 - val_rpn_class_loss: 0.0302 - val_rpn_bbox_loss: 0.3824 - val_mrcnn_class_loss: 1.1554 - val_mrcnn_bbox_loss: 0.5717 - val_mrcnn_mask_loss: 0.6903
Epoch 10/100
100/100 [==============================] - 90s 903ms/step - loss: 2.8210 - rpn_class_loss: 0.0414 - rpn_bbox_loss: 0.3935 - mrcnn_class_loss: 1.1850 - mrcnn_bbox_loss: 0.5123 - mrcnn_mask_loss: 0.6889 - val_loss: 2.5782 - val_rpn_class_loss: 0.0272 - val_rpn_bbox_loss: 0.3071 - val_mrcnn_class_loss: 1.0568 - val_mrcnn_bbox_loss: 0.4993 - val_mrcnn_mask_loss: 0.6878
