jacobgil / pytorch-grad-cam

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Home Page: https://jacobgil.github.io/pytorch-gradcam-book

License: MIT License

Language: Python (100%)
Topics: deep-learning, pytorch, grad-cam, visualizations, interpretability, interpretable-ai, interpretable-deep-learning, score-cam, class-activation-maps, vision-transformers

pytorch-grad-cam's Introduction


Advanced AI explainability for PyTorch

pip install grad-cam

Documentation with advanced tutorials: https://jacobgil.github.io/pytorch-gradcam-book

This is a package with state-of-the-art methods for Explainable AI for computer vision. It can be used for diagnosing model predictions, either in production or while developing models. The aim is also to serve as a benchmark of algorithms and metrics for researching new explainability methods.

⭐ Comprehensive collection of Pixel Attribution methods for Computer Vision.

⭐ Tested on many Common CNN Networks and Vision Transformers.

⭐ Advanced use cases: Works with Classification, Object Detection, Semantic Segmentation, Embedding-similarity and more.

⭐ Includes smoothing methods to make the CAMs look nice.

⭐ High performance: full support for batches of images in all methods.

⭐ Includes metrics for checking if you can trust the explanations, and tuning them for best performance.


Method | What it does
GradCAM | Weight the 2D activations by the average gradient
HiResCAM | Like GradCAM but element-wise multiply the activations with the gradients; provably guaranteed faithfulness for certain models
GradCAMElementWise | Like GradCAM but element-wise multiply the activations with the gradients, then apply a ReLU operation before summing
GradCAM++ | Like GradCAM but uses second-order gradients
XGradCAM | Like GradCAM but scales the gradients by the normalized activations
AblationCAM | Zero out activations and measure how the output drops (this repository includes a fast batched implementation)
ScoreCAM | Perturb the image by the scaled activations and measure how the output drops
EigenCAM | Takes the first principal component of the 2D activations (no class discrimination, but seems to give great results)
EigenGradCAM | Like EigenCAM but with class discrimination: first principal component of Activations*Grad. Looks like GradCAM, but cleaner
LayerCAM | Spatially weight the activations by positive gradients. Works better especially in lower layers
FullGrad | Computes the gradients of the biases from all over the network, and then sums them
Deep Feature Factorizations | Non-Negative Matrix Factorization on the 2D activations
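
To make the GradCAM row concrete, here is a minimal sketch of the core weighting step (my own illustration, not the library's implementation; activations and grads are assumed to have been captured by hooks at the target layer):

import torch

def gradcam_weighting(activations, grads):
    # activations, grads: (batch, channels, height, width) captured at the target layer
    weights = grads.mean(dim=(2, 3), keepdim=True)  # per-channel average gradient
    cam = (weights * activations).sum(dim=1)        # weight the 2D activations and sum over channels
    return torch.relu(cam)                          # keep only positive contributions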

Visual Examples

[Image: What makes the network think the image label is 'pug, pug-dog']
[Image: What makes the network think the image label is 'tabby, tabby cat']
[Image: Combining Grad-CAM with Guided Backpropagation for the 'pug, pug-dog' class]

Object Detection and Semantic Segmentation


Explaining similarity to other images / embeddings

Deep Feature Factorization

Classification

Resnet50:

[Image grid: rows Dog, Cat; columns Image, GradCAM, AblationCAM, ScoreCAM]

Vision Transformer (Deit Tiny):

[Image grid: rows Dog, Cat; columns Image, GradCAM, AblationCAM, ScoreCAM]

Swin Transformer (Tiny, window: 7, patch: 4, input size: 224):

[Image grid: rows Dog, Cat; columns Image, GradCAM, AblationCAM, ScoreCAM]

Metrics and Evaluation for XAI


Choosing the Target Layer

You need to choose the target layer to compute CAM for. Some common choices are:

  • FasterRCNN: model.backbone
  • Resnet18 and 50: model.layer4[-1]
  • VGG and densenet161: model.features[-1]
  • mnasnet1_0: model.layers[-1]
  • ViT: model.blocks[-1].norm1
  • SwinT: model.layers[-1].blocks[-1].norm1

If you pass a list with several layers, the CAM will be averaged across them. This can be useful if you're not sure which layer will perform best; see the example below.
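
For example, a sketch of averaging over two ResNet50 stages (the layer choice here is illustrative, not a recommendation):

from pytorch_grad_cam import GradCAM
from torchvision.models import resnet50

model = resnet50(pretrained=True)
# Passing several layers averages the resulting CAM across them.
target_layers = [model.layer3[-1], model.layer4[-1]]
cam = GradCAM(model=model, target_layers=target_layers)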


Using from code as a library

from pytorch_grad_cam import GradCAM, HiResCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM, FullGrad
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image, preprocess_image
from torchvision.models import resnet50
import cv2
import numpy as np

model = resnet50(pretrained=True)
target_layers = [model.layer4[-1]]

# Create an input tensor image for your model.
# (Example preprocessing; any pipeline producing a normalized tensor works.)
rgb_img = cv2.imread("input.jpg")[:, :, ::-1]  # BGR -> RGB
rgb_img = np.float32(rgb_img) / 255
input_tensor = preprocess_image(rgb_img, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
# Note: input_tensor can be a batch tensor with several images!

# Construct the CAM object once, and then re-use it on many images:
cam = GradCAM(model=model, target_layers=target_layers)

# You can also use it within a with statement, to make sure it is freed
# in case you need to re-create it inside an outer loop:
# with GradCAM(model=model, target_layers=target_layers) as cam:
#   ...

# We have to specify the target we want to generate
# the Class Activation Maps for.
# If targets is None, the highest scoring category
# will be used for every image in the batch.
# Here we use ClassifierOutputTarget, but you can define your own custom targets
# that are, for example, combinations of categories, or specific outputs in a non-standard model.

targets = [ClassifierOutputTarget(281)]

# You can also pass aug_smooth=True and eigen_smooth=True, to apply smoothing.
grayscale_cam = cam(input_tensor=input_tensor, targets=targets)

# In this example grayscale_cam has only one image in the batch:
grayscale_cam = grayscale_cam[0, :]
visualization = show_cam_on_image(rgb_img, grayscale_cam, use_rgb=True)

# You can also get the model outputs without having to run inference again:
model_outputs = cam.outputs
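
As the comments above note, a target is just a callable that maps the model output for one image to the scalar that gets backpropagated. A minimal sketch of a custom target (my own illustration, not part of the package) that combines two categories:

class SumOfCategoriesTarget:
    # Hypothetical custom target: backpropagate the sum of two class scores.
    def __init__(self, categories):
        self.categories = categories

    def __call__(self, model_output):
        # model_output is the model's output for a single image in the batch.
        return sum(model_output[category] for category in self.categories)

targets = [SumOfCategoriesTarget([281, 282])]
grayscale_cam = cam(input_tensor=input_tensor, targets=targets)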

Metrics and evaluating the explanations

from pytorch_grad_cam.utils.model_targets import ClassifierOutputSoftmaxTarget
from pytorch_grad_cam.utils.image import deprocess_image
from pytorch_grad_cam.metrics.cam_mult_image import CamMultImageConfidenceChange

# Create the metric target, often the confidence drop in a score of some category.
# inverse_cams is assumed to hold inverted CAMs (e.g. 1 - grayscale_cams) from a previous run.
metric_target = ClassifierOutputSoftmaxTarget(281)
scores, batch_visualizations = CamMultImageConfidenceChange()(input_tensor,
    inverse_cams, targets, model, return_visualization=True)
visualization = deprocess_image(batch_visualizations[0, :])

# State of the art metric: Remove and Debias
from pytorch_grad_cam.metrics.road import ROADMostRelevantFirst, ROADLeastRelevantFirst
cam_metric = ROADMostRelevantFirst(percentile=75)
scores, perturbation_visualizations = cam_metric(input_tensor, 
  grayscale_cams, targets, model, return_visualization=True)

# You can also average across different percentiles, and combine
# (LeastRelevantFirst - MostRelevantFirst) / 2
from pytorch_grad_cam.metrics.road import (ROADMostRelevantFirstAverage,
                                           ROADLeastRelevantFirstAverage,
                                           ROADCombined)
cam_metric = ROADCombined(percentiles=[20, 40, 60, 80])
scores = cam_metric(input_tensor, grayscale_cams, targets, model)

Advanced use cases and tutorials:

You can use this package for "custom" deep learning models, for example Object Detection or Semantic Segmentation.

You will have to define objects that you can then pass to the CAM algorithms:

  1. A reshape_transform, that aggregates the layer outputs into 2D tensors that will be displayed (see the sketch after this list).
  2. Model Targets, that define which target you want to compute the visualizations for, for example a specific category, or a list of bounding boxes.
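
For a standard ViT with a class token, a reshape_transform can look like the sketch below (the 14x14 token grid is an assumption that holds for 224x224 inputs with 16x16 patches; model names follow the target-layer list above):

def reshape_transform(tensor, height=14, width=14):
    # Drop the class token, then rearrange (batch, tokens, channels)
    # into an image-like (batch, channels, height, width) tensor.
    result = tensor[:, 1:, :].reshape(tensor.size(0), height, width, tensor.size(2))
    result = result.transpose(2, 3).transpose(1, 2)
    return result

cam = GradCAM(model=model, target_layers=[model.blocks[-1].norm1],
              reshape_transform=reshape_transform)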

Here you can find detailed examples of how to use this for various custom use cases like object detection:

These point to the new documentation jupyter-book for fast rendering. The jupyter notebooks themselves can be found under the tutorials folder in the git repository.


Smoothing to get nice looking CAMs

To reduce noise in the CAMs, and make them fit better on the objects, two smoothing methods are supported (see the usage sketch below):

  • aug_smooth=True

    Test time augmentation: increases the run time 6x.

    Applies a combination of horizontal flips, and multiplying the image by [1.0, 1.1, 0.9].

    This has the effect of better centering the CAM around the objects.

  • eigen_smooth=True

    First principal component of activations*weights.

    This has the effect of removing a lot of noise.

[Image grid: AblationCAM with no smoothing, aug smooth, eigen smooth, and aug+eigen smooth]
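
Both options are keyword arguments on the CAM call, so enabling them is a one-line change (reusing the cam, input_tensor and targets from the example above):

# aug_smooth runs the test-time-augmented passes (roughly 6x slower);
# eigen_smooth reduces noise via the first principal component.
grayscale_cam = cam(input_tensor=input_tensor, targets=targets,
                    aug_smooth=True, eigen_smooth=True)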

Running the example script:

Usage: python cam.py --image-path <path_to_image> --method <method> --output-dir <output_dir_path>

To use with a specific device, like cpu, cuda, cuda:0 or mps: python cam.py --image-path <path_to_image> --device cuda --output-dir <output_dir_path>


You can choose between:

GradCAM, HiResCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, LayerCAM, FullGrad and EigenCAM.

Some methods like ScoreCAM and AblationCAM require a large number of forward passes, and have a batched implementation.

You can control the batch size through the cam.batch_size attribute.
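
For example (the batch size value here is illustrative):

from pytorch_grad_cam import AblationCAM

# AblationCAM runs many internal forward passes; a larger batch size
# trades GPU memory for speed.
cam = AblationCAM(model=model, target_layers=target_layers)
cam.batch_size = 32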


Citation

If you use this for research, please cite. Here is an example BibTeX entry:

@misc{jacobgilpytorchcam,
  title={PyTorch library for CAM methods},
  author={Jacob Gildenblat and contributors},
  year={2021},
  publisher={GitHub},
  howpublished={\url{https://github.com/jacobgil/pytorch-grad-cam}},
}

References

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra. https://arxiv.org/abs/1610.02391

Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks. Rachel L. Draelos, Lawrence Carin. https://arxiv.org/abs/2011.08891

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks. Aditya Chattopadhyay, Anirban Sarkar, Prantik Howlader, Vineeth N Balasubramanian. https://arxiv.org/abs/1710.11063

Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks. Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu. https://arxiv.org/abs/1910.01279

Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization. Saurabh Desai and Harish G Ramaswamy. In WACV, pages 972–980, 2020. https://ieeexplore.ieee.org/abstract/document/9093360/

Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs. Ruigang Fu, Qingyong Hu, Xiaohu Dong, Yulan Guo, Yinghui Gao, Biao Li. https://arxiv.org/abs/2008.02312

Eigen-CAM: Class Activation Map using Principal Components. Mohammed Bany Muhammad, Mohammed Yeasin. https://arxiv.org/abs/2008.00299

LayerCAM: Exploring Hierarchical Class Activation Maps for Localization. Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, Yunchao Wei. http://mftp.mmcheng.net/Papers/21TIP_LayerCAM.pdf

Full-Gradient Representation for Neural Network Visualization. Suraj Srinivas, Francois Fleuret. https://arxiv.org/abs/1905.00780

Deep Feature Factorization For Concept Discovery. Edo Collins, Radhakrishna Achanta, Sabine Süsstrunk. https://arxiv.org/abs/1806.10206


pytorch-grad-cam's Issues

GradCAM does not match exactly. Why?

Hi, @mingloo @jacobgil @flyingpot

As a result of classifying with a ResNet, accuracy is over 99%. But if you heat-map the object area with GradCAM using that model file, it does not match exactly. Why?

It seems to be a problem with GradCAM rather than with the ResNet classification training. The objects to be heat-mapped are not local or blob-like, like dogs or cats, but close to a long straight line. In this case, GradCAM seems to miss the object area. Have you experienced this?

For a well-trained ResNet34 model, how do you optimize GradCAM?

Thanks in advance.

from @bemoregt.

use resnet

The ResNet model doesn't have .features.
How can I use it with your code?

googlenet version

Can anyone please tell me how to apply Grad-CAM on GoogLeNet?
Is there any source code in PyTorch?
Thanks!

Why the results were different after each run?

I implemented Grad-CAM in a program from another project (using TensorFlow to train on my own data), but the results were different after each run. Have you ever experienced this?
Your help is very important to me. I hope to receive your reply.

gradients while training

I have trained an object detection model with requires_grad=False. Is this the reason I am getting None while doing backpropagation?

get_gradients() is empty

When extracting the gradient values in the call method of the GradCam class:

self.extractor.get_gradients()[-1].cpu().data.numpy()

the get_gradients() call returns an empty list, and therefore the [-1] produces an IndexError exception.

The only different thing that I am doing is to use a custom ResNet18 model to work with grayscale images that do not follow the standard (224, 224) size. Has anyone faced this problem before?

Why feed different input (same label) to the network get different grad_val?

When I feed different images that share the same label to the network, I check the grad_val. I find that the grad_val corresponding to different images are different. I had thought that, for the same class, the grad_val should be the same, since it depends on the parameters.

So, how should I solve the problem?

Looking forward to your reply.

Two questions?

Thanks for sharing this amazing code. I have two questions. One is: how do I retrain or fine-tune the VGG network with my own data, such as faces with different attributes? The other is: how do I use the label information?

bugs

In the FeatureExtractor class:

for name, module in self.model._modules.items():
    x = module(x)

may raise an error when the net contains a 'view' op.

Applying to customized pre-trained model.

Is there anything to modify when applying this to a new customized model, like ResNet50? I got the following error. In the first run it works; when I run it a second time, it raises the following error. (My ResNet model is customized as a binary classifier used as a feature extractor.)

AttributeError Traceback (most recent call last)
in ()
16 #model = models.resnet50(pretrained=True)
17 model = model_ft
---> 18 grad_cam = GradCam(model=model, feature_module=model.layer4, target_layer_names=["2"], use_cuda=use_cuda)
19
20 img = cv2.imread(image_path, 1)

2 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in train(self, mode)
1070 self.training = mode
1071 for module in self.children():
-> 1072 module.train(mode)
1073 return self
1074

AttributeError: 'builtin_function_or_method' object has no attribute 'train'

sorting gradients and features for multiple layer

This is not an issue so much as a suggestion.

I have been using your code to analyse multiple layers in my network. One of the things I have added is to reverse the list which stores the gradients:
gradients.reverse()

This is because the outputs/targets are recorded on the forward pass, so the first output corresponds to the first layer I am calculating the activation map for,

whereas the gradients are recorded during backpropagation, so the first gradient corresponds to the last layer I am calculating the activation map for.

Therefore, to make sure I have the correct output and gradient combination, the gradients.reverse() is required. I have modified the code too much to know exactly where the best place to call this is, but I think it should be within the get_gradients function:

def get_gradients(self):
    self.feature_extractor.gradients.reverse()
    return self.feature_extractor.gradients

Problem with Applying GradCam with Dataloaders

I've been trying to create a small wrapper to apply GradCAM, like any other model, through a data loader.

But I have a problem with output dimensions: GradCAM outputs the wrong dimensions for the last batch, and I don't understand why.

Below is a minimal reproducible example that shows my problem. Could you please help me understand what I'm doing wrong in GradCaMExplainer?

# import some modules

import torch
from torch.utils.data import Dataset, DataLoader
import torchvision.models as models
import torch.nn as nn
from torch.nn import functional as F
import numpy as np

################################################################
############ EXACT COPY OF CURRENT REPO ########################
class _BaseWrapper(object):
    def __init__(self, model):
        super(_BaseWrapper, self).__init__()
        self.device = next(model.parameters()).device
        self.model = model
        self.handlers = []  # a set of hook function handlers

    def _encode_one_hot(self, ids):
        one_hot = torch.zeros_like(self.logits).to(self.device)
        one_hot.scatter_(1, ids, 1.0)
        return one_hot

    def forward(self, image):
        self.image_shape = image.shape[2:]
        self.logits = self.model(image)
        self.probs = F.softmax(self.logits, dim=1)
        return self.probs.sort(dim=1, descending=True)  # ordered results

    def backward(self, ids):
        """
        Class-specific backpropagation
        """
        one_hot = self._encode_one_hot(ids)
        self.model.zero_grad()
        self.logits.backward(gradient=one_hot, retain_graph=True)

    def generate(self):
        raise NotImplementedError

    def remove_hook(self):
        """
        Remove all the forward/backward hook functions
        """
        for handle in self.handlers:
            handle.remove()


class BackPropagation(_BaseWrapper):
    def forward(self, image):
        self.image = image.requires_grad_()
        return super(BackPropagation, self).forward(self.image)

    def generate(self):
        gradient = self.image.grad.clone()
        self.image.grad.zero_()
        return gradient

class GradCAM(_BaseWrapper):
    """
    "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization"
    https://arxiv.org/pdf/1610.02391.pdf
    Look at Figure 2 on page 4
    """

    def __init__(self, model, candidate_layers=None):
        super(GradCAM, self).__init__(model)
        self.fmap_pool = {}
        self.grad_pool = {}
        self.candidate_layers = candidate_layers  # list

        def save_fmaps(key):
            def forward_hook(module, input, output):
                self.fmap_pool[key] = output.detach()

            return forward_hook

        def save_grads(key):
            def backward_hook(module, grad_in, grad_out):
                self.grad_pool[key] = grad_out[0].detach()

            return backward_hook

        # If any candidates are not specified, the hook is registered to all the layers.
        for name, module in self.model.named_modules():
            if self.candidate_layers is None or name in self.candidate_layers:
                self.handlers.append(module.register_forward_hook(save_fmaps(name)))
                self.handlers.append(module.register_backward_hook(save_grads(name)))

    def _find(self, pool, target_layer):
        if target_layer in pool.keys():
            return pool[target_layer]
        else:
            raise ValueError("Invalid layer name: {}".format(target_layer))

    def generate(self, target_layer):
        fmaps = self._find(self.fmap_pool, target_layer)
        grads = self._find(self.grad_pool, target_layer)
        weights = F.adaptive_avg_pool2d(grads, 1)
        gcam = torch.mul(fmaps, weights).sum(dim=1, keepdim=True)
        gcam = F.relu(gcam)
        gcam = F.interpolate(
            gcam, self.image_shape, mode="bilinear", align_corners=False
        )

        B, C, H, W = gcam.shape
        gcam = gcam.view(B, -1)
        gcam -= gcam.min(dim=1, keepdim=True)[0]
        gcam /= gcam.max(dim=1, keepdim=True)[0]
        gcam = gcam.view(B, C, H, W)

        return gcam
################################################################



class GradCaMExplainer(torch.nn.Module):
    """
    Creates a torch module for grad cam
    """
    def __init__(self, model, target_layer="layer4.1.conv2", topk=1):
        super(GradCaMExplainer, self).__init__()
        self.back_propagator = BackPropagation(model=model)
        self.grad_cam = GradCAM(model=model)
        self.topk = topk
        self.target_layer = target_layer

    def forward(self, x):
        probs, ids = self.back_propagator.forward(x) # sorted
        self.back_propagator.remove_hook()        

        _ = self.grad_cam.forward(x)
        self.grad_cam.remove_hook()
        for i in range(self.topk):
            # Grad-CAM
            self.grad_cam.backward(ids=ids[:, [i]])
            regions = self.grad_cam.generate(target_layer=self.target_layer)       
        return regions
    
def gradcam_explain(grad_explainer, dataloader, with_target=False, device='cpu'):
    """
    This outputs explanations for an entire dataloader
    """
    res_region = []
    res_probs = []
    res_ids = []
    c = 0
    c2 = 0
    for batch in dataloader:
        if with_target:
            inputs, targets = batch
        else:
            inputs = batch
        c += inputs.shape[0]
        inputs = inputs.to(device)
        regions = grad_explainer(inputs)
        print("batch regions", regions.shape)
        print("----------")
        c2 += regions.shape[0]
        res_region.append(regions.to("cpu").numpy())
    print("total element ", c)
    print("total regions ", c2)
    return np.vstack(res_region)

# Let's take any pretrained model
model = models.resnet18(pretrained=True)


# Let's create 10 random images
random_images = torch.rand(10, 3, 224, 224)

class BasicDataset(Dataset):
    """
    The simplest dataset you can imagine
    """
    def __init__(self, X):
        self.X = X
        
    def __len__(self):
        return self.X.shape[0]
    
    def __getitem__(self, idx):
        return self.X[idx, :, :, :]

dataset = BasicDataset(random_images)
# set batch size to 8 and drop last to False so that first batch is 8 and second 2
dataloader = DataLoader(dataset, batch_size=8, drop_last=False)

## check sizes
print("Check batch sizes")
for batch in dataloader:
    print(batch.shape)
print("---------------")


grad_explainer = GradCaMExplainer(model)

regions = gradcam_explain(grad_explainer, dataloader, with_target=False, device='cpu')

As you will see, running this code snippet outputs:

Check batch sizes
torch.Size([8, 3, 224, 224])
torch.Size([2, 3, 224, 224])
---------------
batch regions torch.Size([8, 1, 224, 224])
----------
batch regions torch.Size([8, 1, 224, 224])
----------
total element  10
total regions  16

So for a dataloader with 10 elements and a batch size of 8, I end up with 16 regions explained by GradCAM instead of 10. The mismatch happens during the last batch, but I can't understand why...

Average Pool Layer

Hi,

I was going through the implementation and found that the 'avgpool' layer is ignored. In the ModelOutputs class, in __call__(), the output from FeatureExtractor is simply passed to the classification part of the network, ignoring the 'avgpool' altogether.

Can you please guide me on what I am missing?

Regards,
Asim

I cannot understand why recursively applied GuidedBackpropReLU.

I tried a non-recursive guided backprop version like this:

for idx, module in module_top._modules.items():      =>  for name, module in module_top.named_modules():
recursive_relu_apply(module)                         =>  # recursive_relu_apply(module)
module_top._modules[idx] = GuidedBackpropReLU.apply  =>  module = GuidedBackpropReLU.apply

but the result was different.
As far as I know, using named_modules() (or modules()) we can access all the layers in a module. Then why can proper results only be obtained in the recursive manner? Can someone explain this?
Thank you.

using custom target concept

I have used your code to experiment with Grad-CAM on YOLO, where I had to come up with a custom way of encoding the target, as opposed to a one-hot encoding of the target class.

I am also working on other architectures which have multiple outputs, and have again had to come up with a custom way of encoding the target.

I have seen that you have recently moved towards making the code usable as a library. I wondered if there is a way to use a custom target concept without having to modify the code, which would make my task easier.

If not, may I suggest it for a future implementation?

The gradient of input has been computed twice

Hi, when running your code, I find that the gradient of the input is computed twice.

In gradcam.py, input.requires_grad_ is set to True during preprocessing at line 71:

input = preprocessed_img.requires_grad_(True)

After that, the GradCam model calls one_hot.backward(retain_graph=True) at line 116, which computes the gradient of the input the first time (you can call input.grad.cpu().data.numpy() to verify this). Then the GuidedBackpropReLUModel computes the gradient of the input again at line 198: one_hot.backward(retain_graph=True). This adds the gradient to the former one.

You should set input.requires_grad_ to False before computing GradCam, and reset input.requires_grad_ to True before computing GuidedBackprop, as sketched below.
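
A minimal sketch of that fix (the variable names follow gradcam.py's usage as described above; treat the placement as an assumption, not the repository's code):

# Keep input gradients off while computing the CAM:
input = preprocessed_img.requires_grad_(False)
mask = grad_cam(input, target_index)

# Re-enable them before guided backprop, so input.grad holds only
# the guided-backprop gradient instead of the sum of both passes:
input = input.requires_grad_(True)
gb = gb_model(input, target_index)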

Which conv layer do I really need?

I use your code on my own CNN (Inception-v3), and each conv layer's result is shown. However, the last two conv layers' heat maps are totally different on one case. Which heat map is the useful one?

Do I need to use my own classifier

Hi! Thanks for your wonderful work! I have a question: do I need to use my own classifier instead of the original FC layer when I test my own model, which uses resnet50 as a backbone?

Hope for your answer! @jacobgil

About the requires_grad_ setting

Your code:

one_hot = torch.from_numpy(one_hot).requires_grad_(True)

When I change this code to:

one_hot = torch.from_numpy(one_hot)

I finally get the same grad values. Is setting requires_grad useful here at all?

Using 'input_img' or 'input_tensor' instead of 'input'.

input is not a keyword in Python, but rather a built-in function, input().
But using input instead of some other variable name does spur confusion: most code editors syntax-highlight input, associating it with the function.
A better way would be to use a variable name like input_tensor, input_img or input_.

Edit: I have added a PR #47 with the changes, confirming working of code. Please link the PR to this issue.

What's the meaning of "35" in your code?

Hi, @mingloo @jacobgil @flyingpot

In your code:

grad_cam = GradCam(model=models.resnet50(pretrained=True),
                   target_layer_names=["35"], use_cuda=args.use_cuda)

What is the meaning of "35"?

I get an error when using the above code:
AttributeError: 'ResNet' object has no attribute 'features'

What am I doing wrong?

Thanks in advance ~

from Gromit Park.

The newest code does not work on images that are not 224*224.

I tried to process the original images in the ImageNet dataset, which are not 224*224. However, I got:

cam = heatmap + np.float32(img)
ValueError: operands could not be broadcast together with shapes (500,334,3) (334,500,3)

I solved this problem by using the cv2.resize(img, (224, 224)) operation, but I don't know if there is still a hidden problem.

the gb image on my personal network

Hi, I am trying to apply this model to my personal network.
My network has two inputs of different sizes. When I visualize the gb image of the big input, only an area of the image that is the same size as the smaller input shows the gradient.
I would like to know how to solve this problem, thank you very much.

def __call__(self, input_small, input_big, target_category=None):
    if self.cuda:
        input_small = input_small.cuda()
        input_big = input_big.cuda()
    input_small = input_small.requires_grad_(True)
    input_big = input_big.requires_grad_(True)
    cls_score, offsets = self.forward(input_small, input_big)

    B, _, H, W = cls_score.shape
    cls_score = cls_score.reshape(B, -1)  # 1, 5329
    if target_category == None:
        target_category = np.argmax(cls_score.cpu().data.numpy())
    one_hot = np.zeros((1, cls_score.size()[-1]), dtype=np.float32)
    one_hot[0][target_category] = 1
    one_hot = torch.from_numpy(one_hot).requires_grad_(True)
    if self.cuda:
        one_hot = one_hot.cuda()
    one_hot = torch.sum(one_hot * cls_score)
    one_hot.backward(retain_graph=True)
    output = input_big.grad.cpu().data.numpy()
    # output2 = input_small.grad.cpu().data.numpy()  # (1, 512, 512)
    output = output[0, :, :, :]  # (1, 800, 800)
    # output2 = output2[0, :, :, :]  # (1, 512, 512)
    return output  # , output2

the differences between ReLU and GuidedBackpropReLU

Sorry, I can't see any difference between PyTorch's regular ReLU and the customized GuidedBackpropReLU. Can someone explain? Thanks.

Here is the code; the outputs from them are the same. WHY? If they are the same, then why do we use GuidedBackpropReLU?

import torch
from torch.autograd import Function

class GuidedBackpropReLU(Function):
    '''A special ReLU: during backpropagation, only positive inputs and positive gradients are considered'''
    
    
    @staticmethod
    def forward(ctx, input_img):  # torch.Size([1, 64, 112, 112])
        positive_mask = (input_img > 0).type_as(input_img)  # torch.Size([1, 64, 112, 112])
        # output = torch.addcmul(torch.zeros(input_img.size()).type_as(input_img), input_img, positive_mask)
        output = input_img * positive_mask  # equivalent to the commented line above
        ctx.save_for_backward(input_img, output)
        return output  # torch.Size([1, 64, 112, 112])
    
    # # The forward defined above is equivalent to the one commented out below
    # @staticmethod
    # def forward(ctx, input_img):  # torch.Size([1, 64, 112, 112])
    #     output = torch.clamp(input_img, min=0.0)
    #     # print('requires_grad of the input tensor inside the function:', input_img.requires_grad)
    #     ctx.save_for_backward(input_img, output)
    #     return output  # torch.Size([1, 64, 112, 112])

    @staticmethod
    def backward(ctx, grad_output):  # torch.Size([1, 2048, 7, 7])
        input_img, output = ctx.saved_tensors  # torch.Size([1, 2048, 7, 7]) torch.Size([1, 2048, 7, 7])
        # grad_input = None  # this line has no effect
        positive_mask_1 = (input_img > 0).type_as(grad_output)  # torch.Size([1, 2048, 7, 7])  inputs greater than zero
        positive_mask_2 = (grad_output > 0).type_as(grad_output)  # torch.Size([1, 2048, 7, 7])  gradients greater than zero
        grad_input = torch.addcmul(
                                    torch.zeros(input_img.size()).type_as(input_img),
                                    torch.addcmul(
                                                    torch.zeros(input_img.size()).type_as(input_img), 
                                                    grad_output,
                                                    positive_mask_1
                                    ), 
                                    positive_mask_2
        )
        # grad_input = grad_output * positive_mask_1 * positive_mask_2  # equivalent to the expression above
        return grad_input


torch.manual_seed(seed=20200910)

size = (100,500)
input_data_1 = input = torch.randn(*size, requires_grad=True)

torch.manual_seed(seed=20200910)
input_data_2 = input = torch.randn(*size, requires_grad=True)

torch.manual_seed(seed=20200910)
input_data_3 = input = torch.randn(*size, requires_grad=True)
print('The shapes of the three inputs are:', input_data_1.shape, input_data_2.shape, input_data_3.shape)
# print(input_data_1)
# print(input_data_2)
# print(input_data_3)


loss_1 = torch.sum(torch.nn.ReLU()(input_data_1))
loss_2 = torch.sum(torch.nn.functional.relu(input_data_2))
loss_3 = torch.sum(GuidedBackpropReLU.apply(input_data_3))


loss_1.backward()
loss_2.backward()
loss_3.backward()

print(loss_1, loss_2, loss_3)
print(loss_1.item(), loss_2.item(), loss_3.item())
print('Are the three losses equal?', loss_1.item() == loss_2.item() == loss_3.item())

print('Briefly printing the three gradients...')
print(input_data_1.grad)
print(input_data_2.grad)
print(input_data_3.grad)
print('The shapes of the three gradients are:', input_data_1.grad.shape, input_data_2.grad.shape, input_data_3.grad.shape)

print('Checking whether the three gradients are pairwise equal...')
print(torch.equal(input_data_1.grad, input_data_2.grad))
print(torch.equal(input_data_1.grad, input_data_3.grad))
print(torch.equal(input_data_2.grad, input_data_3.grad))

Here is the output in the console:

The shapes of the three inputs are: torch.Size([100, 500]) torch.Size([100, 500]) torch.Size([100, 500])
tensor(19912.3828, grad_fn=<SumBackward0>) tensor(19912.3828, grad_fn=<SumBackward0>) tensor(19912.3828, grad_fn=<SumBackward0>)
19912.3828125 19912.3828125 19912.3828125
Are the three losses equal? True
Briefly printing the three gradients...
tensor([[1., 1., 1.,  ..., 1., 1., 1.],
        [1., 0., 0.,  ..., 0., 1., 1.],
        [0., 1., 0.,  ..., 1., 1., 1.],
        ...,
        [0., 1., 1.,  ..., 1., 0., 1.],
        [0., 1., 1.,  ..., 1., 1., 1.],
        [0., 0., 1.,  ..., 1., 1., 0.]])
tensor([[1., 1., 1.,  ..., 1., 1., 1.],
        [1., 0., 0.,  ..., 0., 1., 1.],
        [0., 1., 0.,  ..., 1., 1., 1.],
        ...,
        [0., 1., 1.,  ..., 1., 0., 1.],
        [0., 1., 1.,  ..., 1., 1., 1.],
        [0., 0., 1.,  ..., 1., 1., 0.]])
tensor([[1., 1., 1.,  ..., 1., 1., 1.],
        [1., 0., 0.,  ..., 0., 1., 1.],
        [0., 1., 0.,  ..., 1., 1., 1.],
        ...,
        [0., 1., 1.,  ..., 1., 0., 1.],
        [0., 1., 1.,  ..., 1., 1., 1.],
        [0., 0., 1.,  ..., 1., 1., 0.]])
The shapes of the three gradients are: torch.Size([100, 500]) torch.Size([100, 500]) torch.Size([100, 500])
Checking whether the three gradients are pairwise equal...
True
True
True
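
For what it's worth, the three gradients match here only because each loss is a plain sum, so the upstream gradient flowing into every ReLU is +1 and the grad_output > 0 mask in GuidedBackpropReLU never removes anything. A check of my own (reusing the GuidedBackpropReLU class defined above) where the upstream gradient is negative makes the two diverge:

import torch
import torch.nn.functional as F

torch.manual_seed(seed=20200910)
x1 = torch.randn(100, 500, requires_grad=True)
x2 = x1.detach().clone().requires_grad_(True)

# Negating the sum sends an upstream gradient of -1 into the (guided) ReLU:
# plain ReLU passes -1 wherever the input is positive, while
# GuidedBackpropReLU also masks on grad_output > 0 and returns all zeros.
(-F.relu(x1).sum()).backward()
(-GuidedBackpropReLU.apply(x2).sum()).backward()

print(torch.equal(x1.grad, x2.grad))  # False
print(x2.grad.abs().sum())            # tensor(0.)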

CAM with full negative values

Hi, I am trying to apply Grad-CAM on DL-based detectors.

The performance is not ideal. I notice that after the operation on lines 126-129:

        for i, w in enumerate(weights):
            cam += w * target[i, :, :]
        cam = np.maximum(cam, 0)

most of the cam values are negative and are therefore assigned 0.
When doing a visualization, a region that has a detected object might not show any response.

Does the regression on the coordinates interfere with the results?
Is there a way to get around this?

License

Hi,

Thanks for the great work.
Would it be possible for you to add an appropriate license to the work?

I noticed that your corresponding keras implementation is licensed under MIT: https://github.com/jacobgil/keras-grad-cam

Thanks,
Raghav

How to apply CAM to a personal network?

Hello!
If some special operation is added in the forward() function of the main network, such as second-order pooling, it is difficult to split the network into two parts (feature and classifier). Is there any way to apply a CAM method to such a personal network?

test_caffe_model

Thanks for your sharing and nice work! @jacobgil
How can I get the gradient class activation maps with a model trained in Caffe rather than with PyTorch models?
Thank you very much!

ask for help

Why do you replace ReLU with GuidedBackpropReLU?

Visualizing Heat Maps for weakly supervised models.

Hi,

Can anyone shed some light on how to implement GradCAM on weakly-supervised models containing a CNN+LSTM architecture (an OCR model)? The majority of the repos show how it can be done on class-specific models.

How can I use grad-cam in FPN net?

I have been using the FPN network structure recently, but I have been unable to properly visualize it with Grad-CAM. If anyone knows how to write the code, please let me know. Thanks a lot.
