eoal's Introduction

Bardia Safaei, Vibashan VS, Celso M. de Melo, Vishal M. Patel

Framework: PyTorch

Active Learning (AL) aims to enhance the performance of deep models by selecting the most informative samples for annotation from a pool of unlabeled data. Despite impressive performance in closed-set settings, most AL methods fail in real-world scenarios where the unlabeled data contains unknown categories. Recently, a few studies have attempted to tackle the AL problem in the open-set setting. However, these methods focus more on selecting known samples and do not efficiently utilize the unknown samples obtained during AL rounds. In this work, we propose an Entropic Open-set AL (EOAL) framework, which leverages both known and unknown distributions effectively to select informative samples during AL rounds. Specifically, our approach employs two different entropy scores. One measures the uncertainty of a sample with respect to the known-class distributions. The other measures the uncertainty of the sample with respect to the unknown-class distributions. By utilizing these two entropy scores, we effectively separate the known and unknown samples in the unlabeled data, resulting in better sampling. Through extensive experiments, we show that the proposed method outperforms existing state-of-the-art methods on the CIFAR-10, CIFAR-100, and TinyImageNet datasets.
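For intuition only, the snippet below sketches the generic building block behind entropy-based uncertainty: the Shannon entropy of a softmax probability vector. It is not the EOAL scoring code, and it does not reproduce the two scores described above. The function names `softmax` and `shannon_entropy` are illustrative, and plain Python is used so the snippet runs without PyTorch.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A peaked prediction has low entropy; a flat prediction has high entropy.
confident = shannon_entropy(softmax([8.0, 0.0, 0.0]))
uncertain = shannon_entropy(softmax([1.0, 1.0, 1.0]))
print(confident < uncertain)
```

A flat distribution over K classes reaches the maximum entropy, log K, which is why high entropy is commonly read as high uncertainty.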

Setup and Dependencies

  1. Create and activate a conda environment with Python 3.7 as follows:
conda create -n EOAL python=3.7.16
conda activate EOAL
  2. Install dependencies:
pip install -r environment.txt
  3. Modify the dataloader.py file in the torch.utils.data.DataLoader source code as described here.

Run

First, create a folder ~/data. The datasets will be downloaded to this folder automatically when you run the code.
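If you prefer to create the folder from Python, a minimal sketch follows. The helper name `ensure_data_dir` is hypothetical and not part of this repository; it only creates the directory and does not download anything.

```python
from pathlib import Path


def ensure_data_dir(root="~/data"):
    """Create the dataset folder if it is missing and return its resolved path."""
    path = Path(root).expanduser()
    path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return path


# Usage: ensure_data_dir() creates ~/data in the current user's home directory.
```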

CIFAR-10

For the CIFAR-10 experiment with a mismatch ratio of 20%, run:

python main.py --query-strategy eoal --init-percent 1 --known-class 2 --query-batch 1500 --seed 1 --model resnet18 --dataset cifar10 --max-query 11 --max-epoch 300 --stepsize 60 --diversity 1 --gpu 0

CIFAR-100

For the CIFAR-100 experiment with a mismatch ratio of 20%, run:

python main.py --query-strategy eoal --init-percent 8 --known-class 20 --query-batch 1500 --seed 1 --model resnet18 --dataset cifar100 --max-query 11 --max-epoch 300 --stepsize 60 --diversity 1 --gpu 0

Here is a description of some important arguments:

  1. --query-strategy: Active sampling method. Supports random and eoal.
  2. --init-percent: Percentage of data labeled initially. Use 1 for cifar10 and 8 for cifar100 and tiny-imagenet.
  3. --known-class: Number of known classes, which sets the mismatch ratio. For mismatch ratios of 20%, 30%, and 40%, use 2, 3, 4 for cifar10 and 20, 30, 40 for cifar100.
  4. --query-batch: Annotation budget. Use 1500 for all experiments.
  5. --max-query: Number of AL rounds. Use 11 for 10 AL cycles (the first round reports the performance on the initial labeled data).
  6. --diversity: Whether to use diversity sampling. The default value is 1.
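To run several mismatch ratios, the command lines can be generated rather than typed by hand. The sketch below is a hypothetical helper, not part of this repository. It uses only the flags and values shown above, and it assumes that the number of known classes is the mismatch ratio times the class count, as the listed values (2, 3, 4 and 20, 30, 40) suggest.

```python
# Values copied from the argument descriptions above.
NUM_CLASSES = {"cifar10": 10, "cifar100": 100}
INIT_PERCENT = {"cifar10": 1, "cifar100": 8}


def build_command(dataset, mismatch_percent, seed=1, gpu=0):
    """Return the main.py command for a dataset and mismatch ratio (in percent)."""
    # Assumption: known classes = mismatch ratio x total number of classes.
    known, remainder = divmod(NUM_CLASSES[dataset] * mismatch_percent, 100)
    if remainder:
        raise ValueError("mismatch ratio must give a whole number of classes")
    args = [
        "python", "main.py",
        "--query-strategy", "eoal",
        "--init-percent", str(INIT_PERCENT[dataset]),
        "--known-class", str(known),
        "--query-batch", "1500",
        "--seed", str(seed),
        "--model", "resnet18",
        "--dataset", dataset,
        "--max-query", "11",
        "--max-epoch", "300",
        "--stepsize", "60",
        "--diversity", "1",
        "--gpu", str(gpu),
    ]
    return " ".join(args)


# Print one command per dataset and mismatch ratio.
for dataset in NUM_CLASSES:
    for ratio in (20, 30, 40):
        print(build_command(dataset, ratio))
```

With the default seed and GPU, the 20% entries reproduce the two commands given in the CIFAR-10 and CIFAR-100 sections.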

Reference

If you find this codebase useful in your research, please consider citing our paper:

@inproceedings{safaei2024entropic,
  title={Entropic open-set active learning},
  author={Safaei, Bardia and Vibashan, VS and de Melo, Celso M and Patel, Vishal M},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={5},
  pages={4686--4694},
  year={2024}
}

Acknowledgements

This code is built upon the LfOSA repository.

eoal's Issues

run time?

How long does a training run take on the CIFAR-100 dataset with a mismatch ratio of 20% and all other options left at their defaults?
It took me 20 hours to get through 5 cycles on an RTX 3090. Is it expected to take this long?

TinyImageNet not present

I found that the argument parser offers tinyimagenet as a dataset option, but dataset.py contains nothing that corresponds to it.

Moreover, the __factory dictionary does not have a tinyimagenet key:

__factory = {
    'cifar100': CIFAR100,
    'cifar10': CIFAR10
}

Maybe this is because TinyImageNet is not implemented in torchvision.datasets. If so, how do you obtain and use it?

Thanks for the feedback!

Inquiry about MQNet and LfOSA implementation

Hi,

Thank you for sharing your research.
I'm wondering whether you used the official code for MQNet and LfOSA or reproduced them for your experiments.

If you reproduced them, could you share the code?

best regards,
