🌟 CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection 🌟

Python 3.8 pytorch 1.12.1

This repository contains the official implementation of the following paper:

CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection
Jiabao Wang*, Yuming Chen*, Zhaohui Zheng, Xiang Li, Ming-Ming Cheng, Qibin Hou*
(* denotes equal contribution)
VCIP, School of Computer Science, Nankai University

[Arxiv Paper]

Introduction

Knowledge Distillation (KD) has been validated as an effective model compression technique for learning compact object detectors. Existing state-of-the-art KD methods for object detection are mostly based on feature imitation, which is generally observed to be better than prediction mimicking. In this paper, we show that the inconsistency of the optimization objectives between the ground-truth signals and distillation targets is the key reason for the inefficiency of prediction mimicking. To alleviate this issue, we present a simple yet effective distillation scheme, termed CrossKD, which delivers the intermediate features of the student's detection head to the teacher's detection head. The resulting cross-head predictions are then forced to mimic the teacher's predictions. Such a distillation manner relieves the student's head from receiving contradictory supervision signals from the ground-truth annotations and the teacher's predictions, greatly improving the student's detection performance. On MS COCO, with only prediction mimicking losses applied, our CrossKD boosts the average precision of GFL ResNet-50 with 1x training schedule from 40.2 to 43.7, outperforming all existing KD methods for object detection.
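
The cross-head data flow described above can be illustrated with a toy, framework-free sketch. Everything in it is an assumption made for illustration: the heads are tiny dense layers rather than convolutional towers, the names `dense`, `run_layers`, and `crosskd_forward` are hypothetical, and KL divergence stands in for whichever prediction-mimicking loss a given detector uses. It is not the repository's implementation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def dense(weights: Sequence[Sequence[float]], relu: bool = True) -> Layer:
    """Toy fully connected layer (hypothetical stand-in for a head's conv layer)."""
    def layer(x: Vector) -> Vector:
        out = [sum(w * v for w, v in zip(row, x)) for row in weights]
        return [max(0.0, o) for o in out] if relu else out
    return layer


def run_layers(layers: Sequence[Layer], x: Vector) -> Vector:
    """Apply the given layers in order."""
    for layer in layers:
        x = layer(x)
    return x


def softmax(logits: Vector) -> Vector:
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Vector, q: Vector) -> float:
    """KL(p || q); an assumed choice of mimicking loss for this sketch."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def crosskd_forward(
    student_head: Sequence[Layer],
    teacher_head: Sequence[Layer],
    student_feat: Vector,
    teacher_feat: Vector,
    split: int,
) -> Tuple[Vector, Vector, float]:
    # The student's first `split` head layers produce the intermediate feature.
    inter = run_layers(student_head[:split], student_feat)
    # Student prediction: would be supervised by ground truth only.
    student_pred = run_layers(student_head[split:], inter)
    # Cross-head prediction: the student's intermediate feature is passed
    # through the remaining layers of the teacher's head.
    cross_pred = run_layers(teacher_head[split:], inter)
    # Teacher prediction: the distillation target.
    teacher_pred = run_layers(teacher_head, teacher_feat)
    # The cross-head prediction, not the student prediction, mimics the teacher.
    kd_loss = kl_divergence(softmax(teacher_pred), softmax(cross_pred))
    return student_pred, cross_pred, kd_loss


# Made-up weights and features, for demonstration only.
teacher = [dense([[1.0, 0.5], [0.2, 1.0]]), dense([[1.0, -1.0], [0.5, 0.5]], relu=False)]
student = [dense([[0.8, 0.4], [0.1, 0.9]]), dense([[0.9, -0.8], [0.4, 0.6]], relu=False)]
student_pred, cross_pred, kd_loss = crosskd_forward(
    student, teacher, [1.0, 2.0], [1.0, 2.0], split=1
)
print(student_pred, cross_pred, kd_loss)
```

In this arrangement the distillation loss reaches the student only through its layers before `split`, so the student's own prediction layers receive gradients from the ground-truth loss alone, which is the separation of supervision signals the paragraph above describes.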

[Figure: structure]

Get Started

1. Prerequisites

Dependencies

  • Ubuntu >= 20.04
  • CUDA >= 11.3
  • pytorch==1.12.1
  • torchvision==0.13.1
  • mmcv==2.0.0rc4
  • mmengine==0.7.3

Our implementation is based on MMDetection==3.0.0rc6. For more information about installation, please see the official instructions.
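
Once the environment has been set up, the pinned versions can be checked from Python. The following is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper names `installed_version` and `find_mismatches` are hypothetical, and the keys assume the distribution names that pip reports (`torch` rather than `pytorch`).

```python
from importlib import metadata
from typing import Dict, Optional

# Pins copied from the dependency list; keys are assumed pip distribution names.
PINNED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmcv": "2.0.0rc4",
    "mmengine": "0.7.3",
}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(installed: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Compare installed versions with PINNED and describe every difference."""
    problems = {}
    for name, wanted in PINNED.items():
        found = installed.get(name)
        if found is None:
            problems[name] = f"not installed (expected {wanted})"
        # Ignore local build suffixes such as "+cu113".
        elif found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    report = find_mismatches({name: installed_version(name) for name in PINNED})
    for name, message in report.items():
        print(f"{name}: {message}")
    if not report:
        print("All pinned packages match.")
```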

Step 0. Create Conda Environment

conda create --name openmmlab python=3.8 -y
conda activate openmmlab

Step 1. Install PyTorch

conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch

Step 2. Install MMEngine and MMCV using MIM.

pip install -U openmim
mim install "mmengine==0.7.3"
mim install "mmcv==2.0.0rc4"

Step 3. Install CrossKD.

git clone https://github.com/jbwang1997/CrossKD
cd CrossKD
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.

Step 4. Prepare the dataset following the official instructions.

2. Training

Single GPU

python tools/train.py configs/crosskd/${CONFIG_FILE} [optional arguments]

Multi GPU

CUDA_VISIBLE_DEVICES=x,x,x,x bash tools/dist_train.sh \
    configs/crosskd/${CONFIG_FILE} ${GPU_NUM} [optional arguments]
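
For scripted experiments, the two launch commands above can be assembled programmatically. The sketch below is only an illustration: `build_train_command` is a hypothetical helper, `my_config.py` is a placeholder rather than a real config name, and `dist_train.sh` is invoked through `bash` on the assumption that it is a shell script, as in upstream MMDetection.

```python
import shlex
from typing import Dict, List, Sequence, Tuple


def build_train_command(
    config_file: str,
    gpu_ids: Sequence[int] = (0,),
    extra_args: Sequence[str] = (),
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, extra environment variables) for a training run."""
    config_path = f"configs/crosskd/{config_file}"
    if len(gpu_ids) > 1:
        # Multi-GPU: the distributed launcher takes the config and the GPU count.
        argv = ["bash", "tools/dist_train.sh", config_path, str(len(gpu_ids))]
        env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    else:
        # Single GPU: plain Python entry point, no extra environment.
        argv = ["python", "tools/train.py", config_path]
        env = {}
    return argv + list(extra_args), env


# "my_config.py" is a placeholder; substitute a file from configs/crosskd/.
argv, env = build_train_command("my_config.py", gpu_ids=[0, 1, 2, 3])
prefix = " ".join(f"{key}={value}" for key, value in env.items())
print(prefix, shlex.join(argv))
```

To actually launch the run, the pair can be passed to `subprocess.run(argv, env={**os.environ, **env}, check=True)` from the repository root.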

3. Evaluation

python tools/test.py configs/crosskd/${CONFIG_FILE} ${CHECKPOINT_FILE}

Results

1. GFL

| Method | Schedule | AP | Config | Download |
| --- | --- | --- | --- | --- |
| GFL-Res101 (T) | 2x+ms | 44.9 | config | model |
| GFL-Res50 (S) | 1x | 40.2 | config | model |
| CrossKD | 1x | 43.7 (+3.5) | config | model |
| CrossKD+PKD | 1x | 43.9 (+3.7) | config | model |

2. RetinaNet

| Method | Schedule | AP | Config | Download |
| --- | --- | --- | --- | --- |
| RetinaNet-Res101 (T) | 2x | 38.9 | config | model |
| RetinaNet-Res50 (S) | 2x | 37.4 | config | model |
| CrossKD | 2x | 39.7 (+2.3) | config | model |
| CrossKD+PKD | 2x | 39.8 (+2.4) | config | model |

3. FCOS

| Method | Schedule | AP | Config | Download |
| --- | --- | --- | --- | --- |
| FCOS-Res101 (T) | 2x+ms | 40.8 | config | model |
| FCOS-Res50 (S) | 2x+ms | 38.5 | config | model |
| CrossKD | 2x+ms | 41.1 (+2.6) | config | model |
| CrossKD+PKD | 2x+ms | 41.3 (+2.8) | config | model |

4. ATSS

| Method | Schedule | AP | Config | Download |
| --- | --- | --- | --- | --- |
| ATSS-Res101 (T) | 1x | 41.5 | config | model |
| ATSS-Res50 (S) | 1x | 39.4 | config | model |
| CrossKD | 1x | 41.8 (+2.4) | config | model |
| CrossKD+PKD | 1x | 41.8 (+2.4) | config | model |
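
The gains shown in parentheses are differences from the student (S) baseline of the same detector. As a quick sanity check, they can be recomputed from the AP values in the four tables. The dictionary layout and the name `ap_gains` below are illustrative assumptions; the numbers are copied from the tables.

```python
from typing import Dict, Tuple

# (student baseline AP, CrossKD AP, CrossKD+PKD AP), copied from the tables.
RESULTS: Dict[str, Tuple[float, float, float]] = {
    "GFL": (40.2, 43.7, 43.9),
    "RetinaNet": (37.4, 39.7, 39.8),
    "FCOS": (38.5, 41.1, 41.3),
    "ATSS": (39.4, 41.8, 41.8),
}


def ap_gains(results: Dict[str, Tuple[float, float, float]]) -> Dict[str, Tuple[float, float]]:
    """Return (CrossKD gain, CrossKD+PKD gain) over the student baseline."""
    return {
        detector: (round(crosskd - base, 1), round(combined - base, 1))
        for detector, (base, crosskd, combined) in results.items()
    }


GAINS = ap_gains(RESULTS)
for detector, (crosskd_gain, combined_gain) in GAINS.items():
    print(f"{detector}: CrossKD +{crosskd_gain}, CrossKD+PKD +{combined_gain}")
```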

Citation

If you find our repo useful for your research, please cite us:

@misc{wang2023crosskd,
      title={CrossKD: Cross-Head Knowledge Distillation for Dense Object Detection}, 
      author={Jiabao Wang and Yuming Chen and Zhaohui Zheng and Xiang Li and 
              Ming-Ming Cheng and Qibin Hou},
      year={2023},
      eprint={2306.11369},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

This project is based on the open source codebase MMDetection.

@article{mmdetection,
  title   = {{MMDetection}: Open MMLab Detection Toolbox and Benchmark},
  author  = {Chen, Kai and Wang, Jiaqi and Pang, Jiangmiao and Cao, Yuhang and
             Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and
             Liu, Ziwei and Xu, Jiarui and Zhang, Zheng and Cheng, Dazhi and
             Zhu, Chenchen and Cheng, Tianheng and Zhao, Qijie and Li, Buyu and
             Lu, Xin and Zhu, Rui and Wu, Yue and Dai, Jifeng and Wang, Jingdong
             and Shi, Jianping and Ouyang, Wanli and Loy, Chen Change and Lin, Dahua},
  journal= {arXiv preprint arXiv:1906.07155},
  year={2019}
}

License

Licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, for non-commercial use only. Any commercial use requires formal permission first.

Contact

For technical questions, please contact [email protected] and [email protected].

Acknowledgement

This repo is modified from the open-source object detection codebase MMDetection.
