DSSD : Deconvolutional Single Shot Detector

By Cheng-Yang Fu*, Wei Liu*, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg.

*=Equal Contribution

Status now

The first version is complete. SSD and DSSD can now be trained with ResNet-101.

Stay tuned. Models trained on Pascal VOC 2007, 2012, and COCO will be released soon.

Introduction

Deconvolutional SSD brings additional context into state-of-the-art general object detection by adding extra deconvolution structures. The DSSD achieves much better accuracy on small objects compared to SSD.

The code is based on SSD. For more details, please refer to our arXiv paper.

Citing DSSD

Please cite DSSD in your publications if it helps your research:

@inproceedings{Fu2016dssd,
  title = {{DSSD}: Deconvolutional Single Shot Detector},
  author = {Fu, Cheng-Yang and Liu, Wei and Ranga, Ananth and Tyagi, Ambrish and Berg, Alexander C.},
  booktitle = {arXiv preprint arXiv:1701.06659},
}

Contents

  1. Installation
  2. Preparation
  3. Train/Eval
  4. COCO_Models

Installation

  1. Download the code from GitHub. We refer to the cloned directory as $CAFFE_ROOT below.

    git clone https://github.com/chengyangfu/caffe.git
    cd $CAFFE_ROOT
    git checkout dssd
  2. Build the code. Please follow the Caffe instructions to install all necessary packages and build it.

    # Modify Makefile.config according to your Caffe installation.
    cp Makefile.config.example Makefile.config
    make -j8
    # Make sure to add $CAFFE_ROOT/python to your PYTHONPATH.
    make py
    make test -j8
    # (Optional)
    make runtest -j

Preparation

  1. Follow the original SSD instructions for all data preparation. You should have LMDB files for VOC2007. Check that the following two links exist.

    ls $CAFFE_ROOT/examples
    # $CAFFE_ROOT/examples/VOC0712/VOC0712_trainval_lmdb
    # $CAFFE_ROOT/examples/VOC0712/VOC0712_test_lmdb
  2. Download the ResNet-101 model files from Deep-Residual-Network.

    # Create the directory for ResNet-101
    cd $CAFFE_ROOT/models
    mkdir ResNet-101
    # Rename the ResNet-101 model files and put them in the ResNet-101 directory
    ls $CAFFE_ROOT/models/ResNet-101
    # $CAFFE_ROOT/models/ResNet-101/ResNet-101-model.caffemodel
    # $CAFFE_ROOT/models/ResNet-101/ResNet-101-deploy.prototxt
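
Before training, it can help to confirm that the paths listed in the two steps above are in place. The following is a minimal sketch, assuming the directory layout shown in this README; `missing_inputs` and `EXPECTED_PATHS` are hypothetical names used for illustration and are not part of the repository.

```python
import os

# Paths relative to $CAFFE_ROOT, copied from the listings in this README.
EXPECTED_PATHS = [
    "examples/VOC0712/VOC0712_trainval_lmdb",
    "examples/VOC0712/VOC0712_test_lmdb",
    "models/ResNet-101/ResNet-101-model.caffemodel",
    "models/ResNet-101/ResNet-101-deploy.prototxt",
]


def missing_inputs(caffe_root):
    """Return the expected paths that do not exist under caffe_root.

    os.path.exists follows symbolic links, so a dangling link to a
    missing LMDB directory is reported as missing.
    """
    return [p for p in EXPECTED_PATHS
            if not os.path.exists(os.path.join(caffe_root, p))]


if __name__ == "__main__":
    root = os.environ.get("CAFFE_ROOT", ".")
    for path in missing_inputs(root):
        print("missing: {}".format(path))
```

Run with `CAFFE_ROOT` set in the environment; no output means every expected path was found.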

Train/Eval

  1. Train and evaluate the SSD model.

    # Train the SSD-ResNet-101 321x321
    python examples/ssd/ssd_pascal_resnet_321.py
    # The GPU settings may need to be changed according to the number of GPUs.
    # Models are generated in:
    # $CAFFE_ROOT/models/ResNet-101/VOC0712/SSD_VOC07_321x321
    # Evaluate the model
    cd $CAFFE_ROOT
    ./build/tools/caffe train --solver="./models/ResNet-101/VOC0712/SSD_VOC07_321x321/test_solver.prototxt"  \
    --weights="./models/ResNet-101/VOC0712/SSD_VOC07_321x321/ResNet-101_VOC0712_SSD_VOC07_321x321_iter_80000.caffemodel" \
    --gpu=0
    # The batch size in test.prototxt may need to be changed.
    # If the batch size is changed, remember to change test_iter in test_solver.prototxt at the same time.
    # It should reach 77.5* mAP at 80k iterations.
  2. Train and evaluate the DSSD model. In this script, the ResNet-101 and SSD-related layers are frozen, and only the DSSD-related layers are trained.

    # Use the SSD-ResNet-101 321x321 as the pretrained model
    python examples/ssd/ssd_pascal_resnet_deconv_321.py
    # Evaluate the model
    cd $CAFFE_ROOT
    ./build/tools/caffe train --solver="./models/ResNet-101/VOC0712/DSSD_VOC07_321x321/test_solver.prototxt"  \
    --weights="./models/ResNet-101/VOC0712/DSSD_VOC07_321x321/ResNet-101_VOC0712_DSSD_VOC07_321x321_iter_30000.caffemodel" \
    --gpu=0
    # It should reach 78.6* mAP at 30k iterations.
  3. Train and evaluate the DSSD model. In this script, we fine-tune the entire network. To fine-tune the network successfully, all batch-norm-related layers must be frozen in Caffe.

    # Use the DSSD-ResNet-101 321x321 as the pretrained model
    python examples/ssd/ssd_pascal_resnet_deconv_ft_321.py
    # Evaluate the model
    cd $CAFFE_ROOT
    ./build/tools/caffe train --solver="./models/ResNet-101/VOC0712/DSSD_VOC07_FT_321x321/test_solver.prototxt"  \
    --weights="./models/ResNet-101/VOC0712/DSSD_VOC07_FT_321x321/ResNet-101_VOC0712_DSSD_VOC07_FT_321x321_iter_40000.caffemodel" \
    --gpu=0
    # Fine-tuning the entire network only works for the model with 513x513 inputs, not 321x321.
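
The link between the test batch size and `test_iter` mentioned in step 1 can be sketched as follows. This assumes the goal is for one evaluation pass to cover every test image once, and it uses the 4,952 images of the VOC2007 test set as the default; `compute_test_iter` is a hypothetical helper name, not a function from the repository.

```python
import math

VOC2007_TEST_IMAGES = 4952  # number of images in the PASCAL VOC 2007 test set


def compute_test_iter(batch_size, num_test_images=VOC2007_TEST_IMAGES):
    """Return how many test iterations are needed to cover every test image.

    Rounds up so that a final, partially filled batch is still evaluated.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return int(math.ceil(num_test_images / float(batch_size)))


if __name__ == "__main__":
    for bs in (1, 4, 8):
        print("batch_size={} -> test_iter={}".format(bs, compute_test_iter(bs)))
```

For example, halving the test batch size to fit a smaller GPU doubles the value that should be written to `test_iter`.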

COCO_Models

  1. We add two scripts for training SSD/DSSD with 513x513 inputs on COCO.

    # Train SSD513-ResNet101 on COCO 
    python examples/ssd/ssd_coco_resnet_513.py
    # Train DSSD513-ResNet101 on COCO and use SSD513 as the pretrained model
    python examples/ssd/ssd_coco_resnet_deconv_513.py
  2. We strongly suggest using the trained models instead of training from scratch.

    SSD_513_COCO

    DSSD_513_COCO

    # Move the compressed files to $CAFFE_ROOT/models/ResNet-101
    cd $CAFFE_ROOT/models/ResNet-101
    tar -vzxf SSD_513_COCO.tar.gz
    tar -vzxf DSSD_513_COCO.tar.gz
  3. In our experiments, the models with 513x513 inputs were trained on NVIDIA P40 GPUs, each with 22GB of memory. Because we add extra batch normalization layers, it is important to keep the mini-batch size at least 5 on each GPU. If you use a GPU with less memory, you will likely be unable to replicate the results.
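
The per-GPU constraint in the last item can be checked before launching a run. This is a minimal sketch, assuming the total batch is divided across GPUs and that the smallest share (rounded down) is the one that matters for batch normalization; `per_gpu_batch`, `meets_batch_norm_minimum`, and `MIN_PER_GPU` are illustrative names, and the value 5 is taken from the note above.

```python
MIN_PER_GPU = 5  # minimum mini-batch size per GPU, from the note above


def per_gpu_batch(total_batch_size, num_gpus):
    """Return the smallest mini-batch any GPU receives (conservative floor)."""
    if total_batch_size < 1 or num_gpus < 1:
        raise ValueError("total_batch_size and num_gpus must be positive")
    return total_batch_size // num_gpus


def meets_batch_norm_minimum(total_batch_size, num_gpus, minimum=MIN_PER_GPU):
    """Check whether every GPU gets at least `minimum` images per step."""
    return per_gpu_batch(total_batch_size, num_gpus) >= minimum


if __name__ == "__main__":
    for total, gpus in ((20, 4), (16, 4), (32, 4)):
        print("total={} gpus={} per_gpu={} ok={}".format(
            total, gpus, per_gpu_batch(total, gpus),
            meets_batch_norm_minimum(total, gpus)))
```

If the check fails, the options are a larger total batch or fewer GPUs; whether the larger per-GPU batch fits in memory depends on the card.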
