Keras-FasterRCNN

Keras implementation of Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
Cloned from https://github.com/yhenon/keras-frcnn/

UPDATE:

  • Supports inception_resnet_v2.
    • To use inception_resnet_v2 from keras.applications as the feature extractor, create a new inception_resnet_v2 model file using transfer/export_imagenet.py.
    • If you use the original inception_resnet_v2 model as the feature extractor, its weights cannot be loaded into the Faster R-CNN model.

USAGE:

  • Both the Theano and TensorFlow backends are supported. However, compile times are very high in Theano, so TensorFlow is highly recommended.

  • train_frcnn.py can be used to train a model. To train on Pascal VOC data, simply do: python train_frcnn.py -p /path/to/pascalvoc/.

  • The Pascal VOC data set (images and bounding-box annotations for the labeled objects) can be obtained from: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar

  • simple_parser.py provides an alternative way to input data, using a text file. Simply provide a text file, with each line containing:

    filepath,x1,y1,x2,y2,class_name

    For example:

    /data/imgs/img_001.jpg,837,346,981,456,cow

    /data/imgs/img_002.jpg,215,312,279,391,cat

    The classes will be inferred from the file. To use the simple parser instead of the default Pascal VOC style parser, use the command line option -o simple. For example: python train_frcnn.py -o simple -p my_data.txt

  • Running train_frcnn.py writes the weights to an hdf5 file on disk and saves all the settings of the training run to a pickle file. These settings can then be loaded by test_frcnn.py for testing.

  • test_frcnn.py can be used to perform inference, given pretrained weights and a config file. Specify a path to the folder containing images: python test_frcnn.py -p /path/to/test_data/

  • Data augmentation can be applied by specifying --hf for horizontal flips, --vf for vertical flips, and --rot for 90-degree rotations.
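
The text-file format accepted by the simple parser (filepath,x1,y1,x2,y2,class_name) can be illustrated with a short standalone sketch. This is not the code in simple_parser.py; the function name parse_simple_annotations and the returned dictionaries are hypothetical, chosen only to show how one line maps to one bounding box and how the class list can be inferred from the file.

```python
import csv
from collections import defaultdict


def parse_simple_annotations(lines):
    """Group boxes by image path and count how often each class appears.

    `lines` is any iterable of strings in the form
    filepath,x1,y1,x2,y2,class_name (hypothetical helper, not repo code).
    """
    boxes_by_image = defaultdict(list)
    class_counts = defaultdict(int)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        filepath, x1, y1, x2, y2, class_name = [field.strip() for field in row]
        boxes_by_image[filepath].append({
            "class": class_name,
            "x1": int(x1), "y1": int(y1),
            "x2": int(x2), "y2": int(y2),
        })
        class_counts[class_name] += 1
    return dict(boxes_by_image), dict(class_counts)


sample = [
    "/data/imgs/img_001.jpg,837,346,981,456,cow",
    "/data/imgs/img_002.jpg,215,312,279,391,cat",
]
boxes, classes = parse_simple_annotations(sample)
print(sorted(classes))  # class names inferred from the file
```

An image with several objects simply appears on several lines, one line per bounding box.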

NOTES:

  • config.py contains all settings for the train or test run. The default settings match those in the original Faster-RCNN paper. The anchor box sizes are [128, 256, 512] and the ratios are [1:1, 1:2, 2:1].
  • The theano backend by default uses a 7x7 pooling region, instead of 14x14 as in the frcnn paper. This cuts down compiling time slightly.
  • The tensorflow backend performs a resize on the pooling region, instead of max pooling. This is much more efficient and has little impact on results.
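
As a rough illustration of the anchor settings above, three sizes combined with three ratios give nine anchor shapes per location. The sketch below assumes each ratio is read as width:height and is multiplied directly by the size; how config.py and the training code actually scale the anchors may differ, and the names anchor_sizes, anchor_ratios and anchor_shapes are hypothetical.

```python
from itertools import product

# Values quoted in the notes above; the variable names are illustrative.
anchor_sizes = [128, 256, 512]
anchor_ratios = [(1, 1), (1, 2), (2, 1)]  # assumed to mean width:height


def anchor_shapes(sizes, ratios):
    """Return one (width, height) pair per size/ratio combination."""
    return [(size * rw, size * rh) for size, (rw, rh) in product(sizes, ratios)]


shapes = anchor_shapes(anchor_sizes, anchor_ratios)
print(len(shapes))
for width, height in shapes:
    print(width, height)
```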

Example output:

[Four example detection images: ex1, ex2, ex3, ex4]

ISSUES:

  • If you get this error: ValueError: There is a negative shape in the graph!
    then update Keras to the newest version.

  • Make sure to use python2, not python3. If you get this error: TypeError: unorderable types: dict() < dict(), you are using python3.

  • If you run out of memory, try reducing the number of ROIs that are processed simultaneously by passing a lower -n to train_frcnn.py. Alternatively, try reducing the image size from the default value of 600 (this setting is found in config.py).
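
To get a feel for how much the image-size setting affects memory, the sketch below assumes that the value 600 is the target length of the shorter image side, as in the Faster R-CNN paper, and that the longer side is scaled proportionally. The helper resized_shape is hypothetical and is not taken from the repository.

```python
def resized_shape(width, height, min_side=600):
    """Scale (width, height) so the shorter side equals min_side.

    Hypothetical helper; integer arithmetic avoids rounding surprises.
    """
    shorter = min(width, height)
    if width <= height:
        return min_side, height * min_side // shorter
    return width * min_side // shorter, min_side


default_w, default_h = resized_shape(500, 375)             # default setting
small_w, small_h = resized_shape(500, 375, min_side=300)   # halved setting
pixel_ratio = (small_w * small_h) / float(default_w * default_h)
print(default_w, default_h, small_w, small_h, pixel_ratio)
```

Under these assumptions, halving the setting cuts the number of pixels, and with it the size of the convolutional feature maps, to about a quarter.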

Reference

[1] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015
[2] Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016
[3] https://github.com/yhenon/keras-frcnn/
