oshot-meta-learning

Demo implementation of Self-Supervision & Meta-Learning for One-Shot Unsupervised Cross-Domain Detection

Link to the paper: https://arxiv.org/abs/2106.03496

This paper extends our ECCV 2020 work One-Shot Unsupervised Cross-Domain Detection; the code is therefore based on the original implementation: https://github.com/VeloDC/oshot_detection.

The detection framework is inherited from maskrcnn-benchmark and uses PyTorch and CUDA.

This README will guide you through a full run of our method on the Pascal VOC -> AMD benchmarks.

Implementation details

We build on top of Faster R-CNN with a ResNet-50 backbone pre-trained on ImageNet, 300 top proposals after non-maximum suppression, anchors at three scales (128, 256, 512) and three aspect ratios (1:1, 1:2, 2:1).
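The anchor set above is the Cartesian product of the three scales and three aspect ratios. As a quick sketch (assuming the common area-preserving convention for aspect ratios, where ratio = height/width; this is an illustration, not the repository's anchor generator):

```python
import math

def make_anchors(scales=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return (width, height) for each scale/ratio pair, keeping the
    anchor area roughly scale**2 (a common convention, assumed here)."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)  # wider box for r < 1
            h = s * math.sqrt(r)  # taller box for r > 1
            anchors.append((round(w), round(h)))
    return anchors
```

With the defaults this yields 9 anchors per location, including the square 128x128, 256x256 and 512x512 boxes.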

For OSHOT we train the base network for 70k iterations using SGD with momentum set to 0.9; the initial learning rate is 0.001 and decays after 50k iterations. We use a batch size of 1, keep batch normalization layers fixed for both the pretraining and adaptation phases, and freeze the first 2 blocks of ResNet-50. The weight of the rotation task is set to λ=0.05.
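In PyTorch terms the schedule above could look like the following sketch (the 10x decay factor and the `MultiStepLR` mapping are assumptions; the repository drives this through its maskrcnn-benchmark config files instead):

```python
import torch

# Placeholder parameter standing in for the detector's weights.
params = [torch.nn.Parameter(torch.zeros(1))]

# SGD with momentum 0.9 and initial learning rate 0.001, as in the paper.
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9)

# Decay the learning rate once, after 50k iterations (gamma=0.1 is an assumption).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[50_000], gamma=0.1
)
```

Calling `scheduler.step()` once per training iteration reproduces the "decays after 50k iterations" behaviour.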

FULL-OSHOT is trained in two steps. For the first 60k iterations the training is identical to that of OSHOT, while in the last 10k iterations the meta-learning procedure is activated. The inner-loop optimization on the self-supervised task runs for η=5 iterations, and the batch size is 2 to accommodate two transformations of the original image. Specifically, we used gray-scale and color-jitter with brightness, contrast, saturation and hue all set to 0.4. All the other hyperparameters remain unchanged from OSHOT.

Tran-OSHOT differs from OSHOT only in the last 10k training iterations, where the batch size is 2 and the network sees multiple images with different visual appearance in one iteration. Meta-OSHOT is instead identical to FULL-OSHOT except that the transformations are dropped, so the batch size is 1 also in the last 10k pretraining iterations.

The adaptation phase is the same for all the variants: the model obtained from the pretraining phase is updated by fine-tuning on the self-supervised task. The batch size is 1, and dropout with probability p=0.5 is added before the rotation classifier to prevent overfitting. The weight of the auxiliary task is increased to λ=0.2 to speed up adaptation. All the other hyperparameters and settings are the same as those used during pretraining.
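The pieces described above can be sketched as follows. This is a hypothetical illustration of one auxiliary-loss evaluation, not the repository's code: the 2048-d pooled feature size, the `backbone` callable, and the helper names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Rotation classifier with dropout (p=0.5) in front, as described in the text.
rotation_head = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(2048, 4),  # 4 classes: 0/90/180/270 degrees; 2048 is an assumed feature size
)

def rotation_loss(backbone, image, lam=0.2):
    """Rotate the image by a random multiple of 90 degrees, predict which
    rotation was applied, and weight the loss by lam (0.2 during adaptation)."""
    k = int(torch.randint(0, 4, (1,)))
    rotated = torch.rot90(image, k, dims=[-2, -1]).unsqueeze(0)  # (1, C, H', W')
    feats = backbone(rotated)  # assumed to return pooled (1, 2048) features
    logits = rotation_head(feats)
    return lam * F.cross_entropy(logits, torch.tensor([k]))
```

During adaptation this loss would be minimized for a few iterations on the single target image before running detection.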

Installation

Check INSTALL.md for installation instructions.

Datasets

Create a folder named datasets and place the VOC2007 and VOC2012 source datasets inside it (download them from the Pascal VOC website).

Download and extract clipart1k, comic2k and watercolor2k from the authors' website.

If you would like to get our Social Bikes dataset, please contact me (Francesco Cappio) via email.

Performing pretraining

To perform a standard OSHOT pretraining using Pascal VOC as the source dataset:

python tools/train_net.py --config-file configs/amd/voc_pretrain.yaml

To perform an improved pretraining using our meta-learning based procedure:

python tools/train_net.py --config-file configs/amd/voc_pretrain_meta.yaml --meta

Once pretraining is complete, you can test the output model directly on the target domain or perform the one-shot adaptation.

Testing pretrained model

You can test a pretrained model on one of the AMD datasets by referring to the corresponding config file. For example, for clipart:

python tools/test_net.py --config-file configs/amd/oshot_clipart_target.yaml --ckpt <pretrain_output_dir>/model_final.pth

Performing the one-shot adaptation

To run the OSHOT adaptation procedure and obtain results on one of the AMD datasets, refer to one of the config files. For example, for clipart:

python tools/oshot_net.py --config-file configs/amd/oshot_clipart_target.yaml --ckpt <pretrain_output_dir>/model_final.pth

To perform the one-shot adaptation on a model trained with meta-learning, you need to refer to the corresponding _meta config file:

python tools/oshot_net.py --config-file configs/amd/oshot_clipart_target_meta.yaml --ckpt <meta_pretrain_output_dir>/model_final.pth

Qualitative results

Some visualizations for OSHOT, Full-OSHOT and baseline methods:

(Qualitative results figure: see the repository page.)
