
keras-retinanet

This version of RetinaNet is based on fizyr's keras-retinanet and has been modified to be trained on polarimetric images.

The installation steps are the same as for fizyr's repository; you can refer to their README for training and testing on other datasets.

Installation

  1. Clone this repository.
  2. In the repository, execute pip install . --user. Note that due to inconsistencies in how tensorflow should be installed, this package does not define a dependency on tensorflow, as pip would otherwise try to install it (which, at least on Arch Linux, results in an incorrect installation). Please make sure tensorflow is installed as per your system's requirements.
  3. Alternatively, you can run the code directly from the cloned repository; however, you need to run python setup.py build_ext --inplace first to compile the Cython code.
  4. Optionally, install pycocotools if you want to train / test on the MS COCO dataset by running pip install --user git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI.

Testing

An example of testing the network can be seen in this Notebook or with this code. In general, inference of the network works as follows:

boxes, scores, labels = model.predict_on_batch(inputs)

Where boxes is shaped (None, None, 4) (for (x1, y1, x2, y2)), scores is shaped (None, None) (classification score) and labels is shaped (None, None) (label corresponding to the score). In all three outputs, the first dimension represents the batch and the second dimension indexes the list of detections.

Loading models can be done in the following manner:

from keras_retinanet.models import load_model
model = load_model('/path/to/model.h5', backbone_name='resnet50')
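Putting loading and inference together, here is a minimal end-to-end sketch following the upstream fizyr example (the preprocessing utilities are assumed to behave as in upstream, and the 0.5 score threshold is arbitrary):

import numpy as np

from keras_retinanet.models import load_model
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image

# Load an inference model (see the conversion section below for training models).
model = load_model('/path/to/model.h5', backbone_name='resnet50')

# Load and preprocess a single image, then run it through the network.
image = read_image_bgr('/path/to/image.png')
image = preprocess_image(image)
image, scale = resize_image(image)
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))

# Boxes are predicted at the resized resolution; map them back to the original image.
boxes /= scale

# Detections are sorted by score, so we can stop at the first low-scoring one.
for box, score, label in zip(boxes[0], scores[0], labels[0]):
    if score < 0.5:
        break
    print(label, score, box)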

Converting a training model to inference model

The training procedure of keras-retinanet works with training models. These are stripped-down versions of the inference model and only contain the layers necessary for training (regression and classification values). If you wish to do inference with a model (perform object detection on an image), you need to convert the trained model to an inference model. This is done as follows:

# Running directly from the repository:
keras_retinanet/bin/convert_model.py /path/to/training/model.h5 /path/to/save/inference/model.h5

# Using the installed script:
retinanet-convert-model /path/to/training/model.h5 /path/to/save/inference/model.h5

Most scripts (like retinanet-evaluate) also support converting on the fly, using the --convert-model argument.
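Conversion can also be done programmatically; a minimal sketch, assuming the keras_retinanet.models API behaves as in upstream fizyr:

from keras_retinanet import models

# Load the training model and append the box-decoding / filtering layers.
model = models.load_model('/path/to/training/model.h5', backbone_name='resnet50')
model = models.convert_model(model)
model.save('/path/to/save/inference/model.h5')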

Training

keras-retinanet can be trained using this script. Note that the train script uses relative imports since it is inside the keras_retinanet package. If you want to adjust the script for your own use outside of this repository, you will need to switch it to use absolute imports.

If you installed keras-retinanet correctly, the train script will be installed as retinanet-train. However, if you make local modifications to the keras-retinanet repository, you should run the script directly from the repository. That will ensure that your local changes will be used by the train script.

The default backbone is resnet50. You can change this using the --backbone=xxx argument when running the script. xxx can be one of the backbones in the resnet models (resnet50, resnet101, resnet152), mobilenet models (mobilenet128_1.0, mobilenet128_0.75, mobilenet160_1.0, etc.), densenet models or vgg models. The different options are defined by each model in their corresponding Python scripts (resnet.py, mobilenet.py, etc.).
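For example, to select a ResNet-101 backbone (a hypothetical invocation; the hyperparameter values and relative paths are placeholders, and the positional arguments follow the Polar training command in the Usage section below):

# Running directly from the repository:
python keras_retinanet/bin/train.py --backbone resnet101 --epochs 50 --batch-size 1 --steps 1000 pascal /path/to/dataset/main/folder/ train/images train/labels val/images val/labels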

Trained models can't be used directly for inference. To convert a trained model to an inference model, check here.

Usage

The pretrained MS COCO model can be downloaded here. Results using the cocoapi are shown below (note: according to the paper, this configuration should achieve a mAP of 0.357).
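For example, evaluation on MS COCO follows the upstream fizyr usage (assuming the coco generator is unchanged in this fork):

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py coco /path/to/MS/COCO /path/to/model.h5 --convert-model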

No fusion

To train on the Polar dataset

# Running directly from the repository:
python keras_retinanet/bin/train.py --epochs number_of_epoch --batch-size batch_size --steps number_of_steps_per_epoch --weights /path/to/weights/for/fine/tuning --snapshot-path /path/to/save/snapshots pascal /path/to/dataset/main/folder/ /relative/path/to/the/train/images /relative/path/to/the/train/labels /relative/path/to/the/val/images /relative/path/to/the/val/labels

To evaluate on the Polar dataset

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal /path/to/dataset/main/folder/ /relative/path/to/the/test/folder/from/dataset/repository /relative/path/to/the/test/labels/folder/from/dataset/repository /path/to/weights  (--convert-model if needed)

The pretrained Polar models can be downloaded here.

Fusion

This implementation makes it possible to perform early and late fusion between polarimetric and color data.

Early fusion

To train the network with early fusion between two three-channel images:

# Running directly from the repository:
python keras_retinanet/bin/train.py --epochs number_of_epoch --batch-size batch_size --steps number_of_steps_per_epoch --backbone fusion_backbone --weights /path/to/weights/for/fine/tuning --snapshot-path /path/to/save/snapshots pascal-early-fusion /path/to/dataset/main/folder/ /relative/path/to/train/modality1 /relative/path/to/train/modality2 /relative/path/to/train/labels /relative/path/to/val/modality1 /relative/path/to/val/modality2 /relative/path/to/val/labels

Note that to achieve this early fusion scheme, i.e. processing a seven-channel image, you must use one of the following backbones: resnet50-multi, resnet101-multi or resnet152-multi.

To evaluate the network with early fusion between two three-channel images:

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal-early-fusion /path/to/dataset/main/folder/ /relative/path/to/test/modality1 /relative/path/to/test/modality2 /relative/path/to/test/labels /path/to/weights (--convert-model if needed)

Late fusion

For the late fusion scheme, two models trained on three-channel images are used, and their detections are combined according to the chosen filter.

Before evaluating the models, the two RetinaNet networks must have different layer names to avoid conflicts when loading the weights. The script to rename the weights of RetinaNet50, RetinaNet101 and RetinaNet152 can be found here.

Evaluating with the desired filter:

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal-late-fusion /path/to/dataset/main/folder/ /relative/path/to/test/modality1 /relative/path/to/test/modality2 /relative/path/to/test/labels /path/to/first/model/weights --model2=/path/to/second/model/weights (or --model-multimodal if the polarimetric and RGB images are not stackable) --filter-style=desired_filter

Note that if the images are stackable pixel-wise, the --model2 option is used for the second model. If the bounding boxes predicted on the color modality (such as RGB) need to be registered to the polarimetric ones, use the --model-multimodal option for the second model instead, passing it the path to the color modality weights.

The available filters (--filter-style options) are listed below (a short sketch of the soft-NMS score decay follows the list):

  1. Naive NMS filter: --filter-style=naive-fusion and set soft_nms_sigma to 0 (here, line 62)
  2. Naive soft-NMS filter: --filter-style=naive-fusion and set soft_nms_sigma to a value greater than 0 (here, line 62)
  3. Double soft-NMS filter: --filter-style=soft-nms and set soft_nms_sigma to a value greater than 0 (here, line 62)
  4. OR filter: --filter-style=or-filter
  5. AND filter: --filter-style=and-filter
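The soft_nms_sigma parameter referenced above controls the Gaussian score decay from the Soft-NMS paper. A minimal sketch of that decay (a generic formulation, not necessarily this repository's exact implementation; the 0.5 IoU threshold used in the hard-NMS case is an assumption):

import numpy as np

def soft_nms_weight(iou, soft_nms_sigma):
    # Gaussian Soft-NMS: overlapping boxes are down-weighted instead of removed.
    # With soft_nms_sigma == 0 this falls back to hard NMS, where any box
    # overlapping a higher-scoring one beyond the IoU threshold is discarded.
    if soft_nms_sigma == 0:
        return 0.0 if iou > 0.5 else 1.0
    return float(np.exp(-(iou ** 2) / soft_nms_sigma))

# A detection with IoU 0.6 against a higher-scoring box keeps
# exp(-0.36 / 0.5) ~= 0.49 of its score when soft_nms_sigma = 0.5.
print(soft_nms_weight(0.6, 0.5))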

Laplacian pyramid fusion

For this fusion scheme, the two modalities are fused as a pre-processing step, following the Laplacian pyramid fusion presented here.
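As an illustration of the idea, here is a minimal sketch of classical Laplacian pyramid fusion with OpenCV, averaging corresponding levels (the actual blending rule used by this repository's pre-processing may differ):

import cv2
import numpy as np

def laplacian_pyramid_fusion(img_a, img_b, levels=4):
    # Build a Gaussian pyramid by repeated downsampling.
    def gaussian_pyramid(img):
        pyr = [img.astype(np.float32)]
        for _ in range(levels):
            pyr.append(cv2.pyrDown(pyr[-1]))
        return pyr

    # Turn a Gaussian pyramid into a Laplacian pyramid
    # (band-pass levels plus the coarsest level).
    def laplacian_pyramid(gauss):
        lap = [g - cv2.pyrUp(gauss[i + 1], dstsize=(g.shape[1], g.shape[0]))
               for i, g in enumerate(gauss[:-1])]
        lap.append(gauss[-1])
        return lap

    lap_a = laplacian_pyramid(gaussian_pyramid(img_a))
    lap_b = laplacian_pyramid(gaussian_pyramid(img_b))

    # Blend corresponding levels (simple average) and collapse the pyramid.
    fused = [(a + b) / 2.0 for a, b in zip(lap_a, lap_b)]
    out = fused[-1]
    for level in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=(level.shape[1], level.shape[0])) + level
    return np.clip(out, 0, 255).astype(np.uint8)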

To train using Laplacian pyramid fusion:

# Running directly from the repository:
python keras_retinanet/bin/train.py --epochs number_of_epoch --batch-size batch_size --steps number_of_steps_per_epoch --weights /path/to/weights/for/fine/tuning --snapshot-path /path/to/save/snapshots pascal-pyramid /path/to/dataset/main/folder/ /relative/path/to/train/modality1 /relative/path/to/train/modality2 /relative/path/to/train/labels /relative/path/to/val/modality1 /relative/path/to/val/modality2 /relative/path/to/val/labels

To evaluate using Laplacian pyramid fusion:

# Running directly from the repository:
python keras_retinanet/bin/evaluate.py pascal-pyramid /path/to/dataset/main/folder/ /relative/path/to/test/modality1 /relative/path/to/test/modality2 /relative/path/to/test/labels /path/to/model/weights (--convert-model if needed)

Results

Example output images using keras-retinanet are shown below.


On the right, detection results on (I0, I45, I135); in the center, detection results on (S0, S1, S2); and on the left, detection results on (I0, AOP, DOP).

Notes

  • This repository requires Keras 2.2.0 or higher.
  • This repository is tested using OpenCV 3.4.
  • This repository is tested using Python 3.6.
