
✨ MS2A: Memory Storage-to-Adaptation for Cross-domain Few-annotation Object Detection [Project Page]

This repository contains the code for reproducing the paper: MS2A: Memory Storage-to-Adaptation for Cross-domain Few-annotation Object Detection

Overview

[Figure: overall architecture of MS2A]

  • We propose a novel memory storage-to-adaptation mechanism to learn the prior knowledge and transfer it to feature alignment adaptively. To the best of our knowledge, this is the first work to extract the prior knowledge of unlabeled target data to address the CFOD task.
  • We construct a new, challenging benchmark of industrial scenarios that comprises complex backgrounds and large differences between the source and target domains.
  • Experiments show that the proposed MS2A outperforms state-of-the-art methods on both public datasets and the constructed industrial dataset. In particular, MS2A exceeds the state-of-the-art method by 10.4% on the challenging industrial dataset for the 10-annotation setting.

Example Results

Qualitative comparisons on the public datasets under the C $\rightarrow$ F setting.

Qualitative comparisons on the public datasets under the K $\rightarrow$ C setting.

Qualitative comparisons on the proposed industrial datasets under the Indus_S $\rightarrow$ Indus_T1 setting.

Qualitative comparisons on the proposed industrial datasets under the Indus_S $\rightarrow$ Indus_T2 setting.

Installation

Please refer to Installation for installation instructions.

Dataset

Please download our proposed Industrial Datasets from here. They consist of two settings, Indus_S $\rightarrow$ Indus_T1 and Indus_S $\rightarrow$ Indus_T2. We use LabelMe for labeling and annotate all objects as a single part class. The sizes of the training and test sets are listed in the table below.

|       | Indus_S→Indus_T1 (Source) | Indus_S→Indus_T1 (Target) | Indus_S→Indus_T2 (Source) | Indus_S→Indus_T2 (Target) |
|-------|---------------------------|---------------------------|---------------------------|---------------------------|
| train | 4614                      | 10 / 30 / 50              | 4614                      | 10 / 30 / 50              |
| test  | 1153                      | 269                       | 1153                      | 432                       |
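
Since all objects are labeled as a single part class in LabelMe format, converting the annotations to COCO for mmdetection is straightforward. Below is a minimal, hypothetical conversion sketch; the directory layout, rectangle-only shapes, and output path are assumptions, and the released dataset may already ship COCO-style annotations.

```python
# Minimal LabelMe -> COCO conversion sketch (single "part" class).
# Assumes each image has a sibling LabelMe .json file with rectangle shapes;
# adjust the paths and shape handling to match the actual annotation layout.
import glob
import json
import os

def labelme_dir_to_coco(labelme_dir, out_json):
    images, annotations = [], []
    ann_id = 1
    for img_id, json_path in enumerate(sorted(glob.glob(os.path.join(labelme_dir, "*.json"))), 1):
        with open(json_path) as f:
            item = json.load(f)
        images.append({
            "id": img_id,
            "file_name": item["imagePath"],
            "width": item["imageWidth"],
            "height": item["imageHeight"],
        })
        for shape in item["shapes"]:
            (x1, y1), (x2, y2) = shape["points"][:2]
            x, y = min(x1, x2), min(y1, y2)
            w, h = abs(x2 - x1), abs(y2 - y1)
            annotations.append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": 1,          # every object is the "part" class
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    coco = {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "part"}]}
    with open(out_json, "w") as f:
        json.dump(coco, f)
```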

Pretrained models for quick start

All used pretrained models can be downloaded from here. The file yolox_x_8x8_300e_coco_20211126_140254-1ef88d67.pth is the pretrained model from mmdetection. city_foggycity_epoch_20.pth, kitti_city_epoch_18.pth and sim10k_city_epoch_15.pth are the models pretrained on the public cityscapes2foggycityscapes, kitti2cityscapes and sim10k2cityscapes tasks, respectively.
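
As a quick sanity check after downloading, a checkpoint can be loaded and run on a single image with mmdetection's high-level API (mmdetection 2.x assumed). The config/checkpoint pairing and the demo image path below are assumptions; pair each checkpoint with the config it was trained on.

```python
# Quick sanity check for a downloaded checkpoint (mmdetection 2.x API).
# The config/checkpoint pairing and the image path are assumptions.
from mmdet.apis import init_detector, inference_detector

config = './configs/yolox/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom_ft.py'
checkpoint = './city_foggycity_epoch_20.pth'

model = init_detector(config, checkpoint, device='cuda:0')
result = inference_detector(model, 'demo/demo.jpg')   # any test image
model.show_result('demo/demo.jpg', result, score_thr=0.3,
                  out_file='demo_result.jpg')          # save the visualization
```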

Training from scratch

Stage I - Memory Storage

1. Pretrain the base detector
./tools/dist_train.sh ${config_file} ${gpu_number}
e.g.: ./tools/dist_train.sh ./configs/yolox/yolox_fs_x_640x640_50e_18_base.py 1

2. Extract and cluster the prior knowledge
python extract_fea_fewshot.py \
 --depart ${dataset name} \
 --img ${image dir} \
 --config ${config file} \
 --checkpoint ${checkpoint file} \
 --out_path ${save dir} \
 --batch_size 8 \
 --dim ${128/256 large:256}
e.g.:
python extract_fea_fewshot.py \
 --depart Indus_T1 \
 --img /nas/public_data/foggy_cityscapes_COCO_format/train2014 \
 --config ./configs/yolox/yolox_fs_x_640x640_50e_cityscapes_base.py \
 --checkpoint ./work_dirs/yolox_fs_x_640x640_50e_cityscapes_base/best_bbox_mAP_epoch_70.pth \
 --out_path /path/to/save/extracted/feature \
 --batch_size 8 \
 --dim 320
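
Conceptually, this step runs the pretrained detector's backbone and neck over the unlabeled target images and stores pooled feature vectors for later clustering. The sketch below only illustrates that idea and is not the repository's extract_fea_fewshot.py; the resizing, pooling choice, and file names are assumptions.

```python
# Illustrative feature-extraction sketch (mmdetection 2.x assumed), not the
# repository's extract_fea_fewshot.py. Paths and pooling are assumptions.
import glob
import numpy as np
import torch
import mmcv
from mmdet.apis import init_detector

model = init_detector(
    './configs/yolox/yolox_fs_x_640x640_50e_cityscapes_base.py',
    './work_dirs/yolox_fs_x_640x640_50e_cityscapes_base/best_bbox_mAP_epoch_70.pth',
    device='cuda:0')

feats = []
for path in sorted(glob.glob('/path/to/target/images/*.jpg')):
    img = mmcv.imread(path)
    img = mmcv.imresize(img, (640, 640)).astype(np.float32)
    x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).cuda()
    with torch.no_grad():
        # extract_feat returns the multi-level FPN features; average-pool
        # each level spatially and concatenate into one vector per image.
        levels = model.extract_feat(x)
        feats.append(torch.cat([lvl.mean(dim=(2, 3)) for lvl in levels], dim=1))

np.save('/path/to/save/extracted/feature/target_feats.npy',
        torch.cat(feats).cpu().numpy())
```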

python knn_cuda.py \
 --fea_root_path ${last step saved dir} \
 --out_path ${default same as fea_root_path} \
 --n_clusters ${default 100} \
 --depart ${cluster name}
e.g.:
python knn_cuda.py \
 --fea_root_path /path/to/save/extracted/feature/ \
 --out_path /path/to/save/cluster/results/ \
 --n_clusters 100 \
 --depart Indus_T1
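
The clustering step groups the extracted target features into prototypes that form the stored prior knowledge. A minimal CPU equivalent using scikit-learn is sketched below; knn_cuda.py is the authoritative GPU implementation, and the file names here are assumptions.

```python
# Minimal CPU equivalent of the clustering step: group the saved target
# features into n_clusters prototypes (the memory). File names and the use
# of scikit-learn are assumptions; knn_cuda.py is the actual implementation.
import numpy as np
from sklearn.cluster import KMeans

feats = np.load('/path/to/save/extracted/feature/target_feats.npy')  # (N, dim)
kmeans = KMeans(n_clusters=100, random_state=0).fit(feats)

# The cluster centers act as the memory prototypes used in Stage II.
np.save('/path/to/save/cluster/results/Indus_T1_centers.npy',
        kmeans.cluster_centers_.astype(np.float32))
```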

Stage II - Memory Adaptation

1. Source Training
./tools/dist_train.sh ${config_file} ${gpu_number}
e.g.: ./tools/dist_train.sh configs/yolox/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom.py 1

2. Target Training
./tools/dist_train.sh ${config_file} ${gpu_number}
e.g.: ./tools/dist_train.sh configs/yolox/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom_ft.py 1
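
Target training fine-tunes the source-trained model on the few annotated target images. As a rough illustration of what the `*_ft.py` config may override, here is a hypothetical mmdetection-style config sketch; every field value below is an assumption, so consult the actual config shipped with the repository.

```python
# Hypothetical sketch of a target fine-tuning config in mmdetection style.
# The base file, checkpoint path, annotation file, and schedule are all
# assumptions, not the repository's actual settings.
_base_ = './yolox_fs_x_640x640_50e_cityscapes_pretrain_mom.py'

# Start from the source-trained (Stage II, step 1) checkpoint.
load_from = './work_dirs/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom/latest.pth'

# Train only on the few annotated target images.
data = dict(
    train=dict(
        ann_file='annotations/target_10shot.json',
        img_prefix='target/images/'))

# Shorter schedule and smaller learning rate for fine-tuning.
optimizer = dict(type='SGD', lr=0.0005, momentum=0.9, weight_decay=5e-4)
runner = dict(type='EpochBasedRunner', max_epochs=20)
```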

Evaluation

python tools/test.py ${config file} ${checkpoint file} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
e.g.:
python tools/test.py \
 ./configs/yolox/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom_ft.py \
 ./work_dirs/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom_ft/best_0_bbox_mAP_epoch_20.pth \
 --eval bbox

Inference

python tools/inference.py ${config file} ${checkpoint file} ${save dir}
e.g.:
python tools/inference.py \
 ./configs/yolox/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom_ft.py \
 ./work_dirs/yolox_fs_x_640x640_50e_cityscapes_pretrain_mom_ft/best_0_bbox_mAP_epoch_20.pth \
 ./path/to/save

Thanks

Thanks to the public repo mmdetection for providing the base code.
