
Deformable DETR: Deformable Transformers for End-to-End Object Detection

Hakim Chekirou, Celina Hanouti & Aymen Merrouche (Equal Contribution)

Poster

This repository contains an implementation of the paper "Deformable Transformers for End-to-End Object Detection": https://arxiv.org/pdf/2010.04159.pdf

We provide from-scratch implementations of the following modules:

  • deformable_transformer.py, decoder.py, encoder.py, deformable_detr.py and MultiHeadAttention.py.

The remaining modules are mainly copied from the original DETR implementation: https://github.com/facebookresearch/detr

Requirements

pip install -r requirements.txt

Dataset

The coco_extraction.py script provides functions for creating a COCO annotation file restricted to the specified class indexes. It removes every bounding box that does not belong to one of those classes, and every image that does not contain at least one of them. We kept only five randomly sampled classes out of the 91 available: bear, bus, tie, toilet and vase. The annotation files are datasets/coco_light/coco_light_train.json for the train set and datasets/coco_light/coco_light_train.json for the validation set. COCO light contains ~15K images in the train set and 656 images in the validation set.
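The filtering step described above can be sketched as follows. This is a minimal illustration, not the code in coco_extraction.py: the function name filter_coco_annotations and the toy category ids are hypothetical, and the only assumption about the data is the standard COCO annotation layout (top-level "images", "annotations" and "categories" lists).

```python
import json


def filter_coco_annotations(coco, keep_category_ids):
    """Return a new COCO-style dict restricted to the given category ids.

    Hypothetical helper for illustration; it is not taken from
    coco_extraction.py. The input dict is left unmodified.
    """
    keep = set(keep_category_ids)

    # Keep only the bounding boxes whose category was selected.
    annotations = [a for a in coco["annotations"] if a["category_id"] in keep]

    # Keep only the images that still own at least one bounding box.
    image_ids = {a["image_id"] for a in annotations}
    images = [im for im in coco["images"] if im["id"] in image_ids]

    # Keep only the selected category definitions.
    categories = [c for c in coco["categories"] if c["id"] in keep]

    # Carry over any other top-level keys (e.g. "info", "licenses") as-is.
    filtered = {
        k: v
        for k, v in coco.items()
        if k not in ("images", "annotations", "categories")
    }
    filtered.update(images=images, annotations=annotations, categories=categories)
    return filtered


if __name__ == "__main__":
    # Toy annotation dict; the ids below are made up for the example.
    toy = {
        "images": [{"id": 1}, {"id": 2}, {"id": 3}],
        "annotations": [
            {"id": 10, "image_id": 1, "category_id": 1},
            {"id": 11, "image_id": 1, "category_id": 3},
            {"id": 12, "image_id": 2, "category_id": 3},
            {"id": 13, "image_id": 3, "category_id": 2},
        ],
        "categories": [
            {"id": 1, "name": "bear"},
            {"id": 2, "name": "bus"},
            {"id": 3, "name": "dog"},
        ],
    }
    light = filter_coco_annotations(toy, [1, 2])
    print(json.dumps(light, indent=2))
```

On the toy input, image 2 is dropped because its only box belongs to an unselected class, while image 1 is kept with just its selected box.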

Usage

git clone https://github.com/hanouticelina/deformable-DETR.git

cd deformable-DETR

Training

The command for training Deformable DETR is as follows:

python main.py --enc_layers 3 --dec_layers 3 --batch_size 1

Training converges in 72 GPU hours on a single GeForce RTX 2080 GPU.

Evaluation

To evaluate Deformable DETR on a subset of the COCO 2017 validation set with a single GPU, run:

<path to config file> --resume <path to pre-trained model> --eval

A pre-trained model can be found at: https://www.dropbox.com/s/vnkbfrui1ldwtah/checkpoint.pth?dl=0

References

Xizhou Zhu et al., Deformable Transformers for End-to-End Object Detection.

Nicolas Carion et al., End-to-End Object Detection with Transformers.
