lpcam's Introduction

LPCAM

The official code for the CVPR 2023 paper "Extracting Class Activation Maps from Non-Discriminative Features as well" (arXiv).

Prerequisite

  • Python 3.6, PyTorch 1.9, and the other dependencies listed in environment.yml
  • Create the environment from the environment.yml file:
conda env create -f environment.yml

Usage (PASCAL VOC)

Step 1. Prepare dataset.

  • Download the PASCAL VOC 2012 devkit from the official website.
  • Specify the path to the downloaded devkit ('voc12_root') in the following steps.
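Before starting a long training run, it can help to confirm that the path passed as 'voc12_root' really points at the devkit. The following is a minimal sketch, not part of this repository: the helper name `missing_voc_dirs` is hypothetical, and the list of sub-directories is an assumption based on the standard VOC2012 devkit layout.

```python
from pathlib import Path

# Sub-directories of the standard VOC2012 devkit layout (assumed, not read from this repo).
EXPECTED_SUBDIRS = ("JPEGImages", "Annotations", "ImageSets/Segmentation", "SegmentationClass")


def missing_voc_dirs(voc12_root):
    """Return the expected sub-directories that are absent under voc12_root."""
    root = Path(voc12_root)
    return [d for d in EXPECTED_SUBDIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Same path as in the example commands of this README.
    missing = missing_voc_dirs("./VOCdevkit/VOC2012/")
    print("missing:", missing if missing else "none")
```

An empty result only means the folders exist; it does not check their contents.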

Step 2. Train classification network and generate LPCAM seeds.

  • Please specify a workspace to save the model and logs.
CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root ./VOCdevkit/VOC2012/ --work_space YOUR_WORK_SPACE --train_cam_pass True --make_cam_pass True --make_lpcam_pass True --eval_cam_pass True 

Step 3. Train IRN and generate pseudo masks.

CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root ./VOCdevkit/VOC2012/ --work_space YOUR_WORK_SPACE --cam_to_ir_label_pass True --train_irn_pass True --make_sem_seg_pass True --eval_sem_seg_pass True 

You can download the pseudo labels from this link.
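Steps 2 and 3 call the same script and differ only in which pass flags are switched on. The sketch below assembles both command lines from the flag names shown in this README; `build_command` is a hypothetical helper, and it assumes nothing about how run_sample.py parses the flags or what their defaults are.

```python
import shlex

# Pass flags exactly as they appear in the Step 2 and Step 3 commands.
STEP2_PASSES = ("train_cam_pass", "make_cam_pass", "make_lpcam_pass", "eval_cam_pass")
STEP3_PASSES = ("cam_to_ir_label_pass", "train_irn_pass", "make_sem_seg_pass", "eval_sem_seg_pass")


def build_command(passes, voc12_root, work_space, script="run_sample.py"):
    """Return the argv list for one pipeline step with every given pass set to True."""
    argv = ["python", script, "--voc12_root", voc12_root, "--work_space", work_space]
    for name in passes:
        argv += ["--" + name, "True"]
    return argv


if __name__ == "__main__":
    for passes in (STEP2_PASSES, STEP3_PASSES):
        argv = build_command(passes, "./VOCdevkit/VOC2012/", "YOUR_WORK_SPACE")
        # Quote each argument so the printed line can be pasted into a shell.
        print(" ".join(shlex.quote(a) for a in argv))
```

Using the same work space for both steps matters, since Step 3 consumes what Step 2 wrote there.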

Step 4. Train semantic segmentation network.

To train DeepLab-v2, we refer to deeplab-pytorch. We use the ImageNet pre-trained model for DeepLab-v2 provided by AdvCAM. Please replace the ground-truth masks with the generated pseudo masks.
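One way to do the replacement without overwriting the original annotations is to copy the pseudo masks into a separate label directory and point the segmentation trainer at it. This is a minimal sketch under assumptions: the pseudo masks are PNG files named after their image IDs, the trainer reads its training labels from a single directory, and `stage_pseudo_masks` and both directory arguments are hypothetical names. Which directory the trainer actually reads depends on its dataset configuration.

```python
import shutil
from pathlib import Path


def stage_pseudo_masks(pseudo_dir, label_dir):
    """Copy every PNG in pseudo_dir into label_dir; return the sorted file names copied."""
    src, dst = Path(pseudo_dir), Path(label_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for png in sorted(src.glob("*.png")):
        # copy2 keeps timestamps; an existing file with the same name is overwritten.
        shutil.copy2(png, dst / png.name)
        copied.append(png.name)
    return copied
```

Comparing the returned names against the training image list is a quick way to spot images that have no pseudo mask.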

Usage (MS COCO)

Step 1. Prepare dataset.

  • Download MS COCO images from the official COCO website.
  • Generate masks from the annotations (annToMask.py in ./mscoco/).
  • Download MS COCO image-level labels provided by ReCAM from here and put them in ./mscoco/
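The mask-generation step turns per-instance annotations into one semantic label map per image. The sketch below illustrates the idea only and is not the repository's annToMask.py: it assumes each instance is already a binary array (for example from pycocotools' `COCO.annToMask`), that label 0 means background, and that later instances overwrite earlier ones where they overlap. `merge_instance_masks` and the class indices are hypothetical.

```python
import numpy as np


def merge_instance_masks(instance_masks, class_ids, height, width):
    """Paint binary instance masks into one (height, width) label map.

    Pixels covered by no instance stay 0 (background); where instances
    overlap, the one that comes later in the list wins.
    """
    label = np.zeros((height, width), dtype=np.uint8)
    for mask, cls in zip(instance_masks, class_ids):
        label[np.asarray(mask).astype(bool)] = cls
    return label


if __name__ == "__main__":
    first = np.zeros((2, 3), dtype=np.uint8)
    first[0, :] = 1          # top row
    second = np.zeros((2, 3), dtype=np.uint8)
    second[:, 0] = 1         # left column, overlaps the top-left pixel
    print(merge_instance_masks([first, second], [5, 7], 2, 3))
```

The overwrite order is a design choice; sorting instances by area before painting is a common alternative.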

Step 2. Train classification network and generate LPCAM seeds.

  • Please specify a workspace to save the model and logs.
CUDA_VISIBLE_DEVICES=0 python run_sample_coco.py --mscoco_root ../MSCOCO/ --work_space YOUR_WORK_SPACE --train_cam_pass True --make_cam_pass True --make_lpcam_pass True --eval_cam_pass True 

Step 3. Train IRN and generate pseudo masks.

CUDA_VISIBLE_DEVICES=0 python run_sample_coco.py --mscoco_root ../MSCOCO/ --work_space YOUR_WORK_SPACE --cam_to_ir_label_pass True --train_irn_pass True --make_sem_seg_pass True --eval_sem_seg_pass True 

You can download the pseudo labels from this link.

Step 4. Train semantic segmentation network.

  • The same as PASCAL VOC.

Acknowledgment

This code is borrowed from IRN and ReCAM.

lpcam's People

Contributors

zhaozhengchen

lpcam's Issues

How to replace the generated pseudo masks

Hello. First, thank you for the great work. I am new to deep learning, so this question may be an easy one for you.
I am trying to run your code on the PASCAL VOC dataset, but in Step 4 I don't know where to put the pseudo masks generated in Step 3. The VOC2012 folder contains three candidate folders: SegmentationClass, SegmentationClassAug (added by deeplab-pytorch), and SegmentationObject. Do I need to put the generated pseudo masks in all three folders, or in just one of them?
Looking forward to your reply, thank you.

About Generality of LPCAM.

I really appreciate your work on WSSS.
I have a question about LPCAM: how does it integrate with MCTformer, as reported in your experiments? Could you provide the code or detailed instructions?

Is this applicable for binary classification problems?

Hello! Thank you very much for your work.
I'd like to know whether it is also applicable to binary classification problems (only one target class, with everything else as background), where num_classes is set to 1. I'm currently trying this, but I'm encountering issues during clustering.

Could you provide the code on EDAM?

Thanks for your work! I want to compare my work with yours on EDAM and MCTformer, but I found that your paper reports EDAM results on the COCO dataset while the original EDAM paper does not. Could you provide the corresponding code? Looking forward to your reply!

K-Means does not converge for background features

Hello. Firstly, thank you for the great work.

I'm trying to reproduce the results and apply it to my own CAMs, but I'm somewhat stuck during K-Means. Could you please share your features/cams, used as input to LPCAM?

From what I gather, K-Means over the foreground features seems to converge as expected, while running it over the background features frequently results in arbitrarily large tol's. Did you experience this as well? Can it be mitigated by some of the hyperparameters?

My only differences to the original code are:

  1. I increased selected_thres to 0.3. This seemed like the sensible thing to do, considering my CAMs have a higher optimal threshold.
  2. I increased tol from 0.5 to 30 to accelerate training.
  3. I added a max_iteration=100 parameter to the kmeans_pytorch library to prevent infinite loops.

Here are training times for a few classes (plane, bike, bird, boat, bottle), for reference:

class id:  0 , class name: aeroplane
features selected    : torch.Size([14363, 2048]) torch.float32
features not selected: torch.Size([47038, 2048]) torch.float32
running k-means on cuda:0..
[running kmeans]: 10it [00:05,  1.89it/s, center_shift=22.108383, iteration=10, tol=30.000000]  
running k-means on cuda:0..
[running kmeans]: 101it [02:30,  1.45s/it, center_shift=18806.734375, iteration=101, tol=30.000000]Interrupted (iteration=101 > max_iteration=100)
[running kmeans]: 101it [02:30,  1.49s/it, center_shift=18806.734375, iteration=101, tol=30.000000]

class id:  1 , class name: bicycle
features selected    : torch.Size([9652, 2048]) torch.float32
features not selected: torch.Size([38520, 2048]) torch.float32
running k-means on cuda:0..
[running kmeans]: 13it [00:04,  3.11it/s, center_shift=21.436689, iteration=13, tol=30.000000]  
running k-means on cuda:0..
[running kmeans]: 101it [01:59,  1.18s/it, center_shift=15778.094727, iteration=101, tol=30.000000]Interrupted (iteration=101 > max_iteration=100)
[running kmeans]: 101it [01:59,  1.18s/it, center_shift=15778.094727, iteration=101, tol=30.000000]

class id:  2 , class name: bird
features selected    : torch.Size([14301, 2048]) torch.float32
features not selected: torch.Size([63019, 2048]) torch.float32
running k-means on cuda:0..
[running kmeans]: 13it [00:05,  2.22it/s, center_shift=29.299744, iteration=13, tol=30.000000]  
running k-means on cuda:0..
[running kmeans]: 101it [03:32,  1.93s/it, center_shift=36542.941406, iteration=101, tol=30.000000]Interrupted (iteration=101 > max_iteration=100)
[running kmeans]: 101it [03:32,  2.10s/it, center_shift=36542.941406, iteration=101, tol=30.000000]

class id:  3 , class name: boat
features selected    : torch.Size([12694, 2048]) torch.float32
features not selected: torch.Size([44381, 2048]) torch.float32
running k-means on cuda:0..
[running kmeans]: 17it [00:06,  2.43it/s, center_shift=29.191572, iteration=17, tol=30.000000]  
running k-means on cuda:0..
[running kmeans]: 101it [02:14,  1.33s/it, center_shift=51663.605469, iteration=101, tol=30.000000]Interrupted (iteration=101 > max_iteration=100)
[running kmeans]: 101it [02:14,  1.34s/it, center_shift=51663.605469, iteration=101, tol=30.000000]

class id:  4 , class name: bottle
features selected    : torch.Size([7840, 2048]) torch.float32
features not selected: torch.Size([57216, 2048]) torch.float32
running k-means on cuda:0..
[running kmeans]: 19it [00:05,  3.78it/s, center_shift=22.918352, iteration=19, tol=30.000000]  
running k-means on cuda:0..
[running kmeans]: 99it [02:50,  1.73s/it, center_shift=15.979640, iteration=99, tol=30.000000]    
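The logs above depend on two stopping rules: a tolerance on how far the centers move per iteration, and a cap on the number of iterations. The following is a minimal NumPy sketch of Lloyd's k-means with both rules, for experimenting on small feature sets. It is not the kmeans_pytorch implementation: the names `tol` and `max_iter`, the definition of the shift (sum of per-center Euclidean moves), and the handling of empty clusters are all assumptions. A library may compare a different quantity against its tolerance (for example a squared shift), so the numbers are not directly comparable.

```python
import numpy as np


def kmeans(x, k, tol=1e-4, max_iter=100, seed=0):
    """Lloyd's k-means; returns (centers, labels, iterations, converged)."""
    x = np.asarray(x, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct data points.
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    x_sq = (x ** 2).sum(axis=1, keepdims=True)  # shape (n, 1)
    for iteration in range(1, max_iter + 1):
        # Squared distances via ||x||^2 - 2 x.c + ||c||^2, shape (n, k).
        dists = x_sq - 2.0 * x @ centers.T + (centers ** 2).sum(axis=1)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous center
                new_centers[j] = members.mean(axis=0)
        # Total distance moved by all centers in this iteration.
        shift = np.linalg.norm(new_centers - centers, axis=1).sum()
        centers = new_centers
        if shift < tol:
            return centers, labels, iteration, True
    # Iteration cap reached before the shift dropped below tol.
    return centers, labels, max_iter, False
```

Because the shift is measured in feature units, a sensible `tol` depends on the scale and dimensionality of the features. Returning the `converged` flag lets the caller distinguish a real fixed point from a run that was cut off by `max_iter`.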

[...]
