Comments (10)

frgfm commented on May 17, 2024

@Harry-KIT this is a different topic from this issue, but briefly: read through the frames of the video with OpenCV, classify each frame with the model, compute its CAM, then build the GIF from the resulting frames 👌
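
A minimal sketch of those steps, not the script used for the repository's GIF. `compute_cam` is a hypothetical callable standing in for the model forward pass plus CAM extraction (for example with torch-cam), and the function names, the nearest-neighbour resize, and the red-channel overlay are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List

import numpy as np


def resize_nearest(cam: np.ndarray, height: int, width: int) -> np.ndarray:
    """Upsample a 2D CAM to (height, width) with nearest-neighbour indexing."""
    rows = np.arange(height) * cam.shape[0] // height
    cols = np.arange(width) * cam.shape[1] // width
    return cam[rows[:, None], cols[None, :]]


def overlay_cam(frame: np.ndarray, cam: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a 2D CAM onto an (H, W, 3) uint8 RGB frame as a red heat layer."""
    cam = cam.astype(np.float32)
    span = float(cam.max() - cam.min())
    cam = (cam - cam.min()) / span if span > 0 else np.zeros_like(cam)
    cam = resize_nearest(cam, frame.shape[0], frame.shape[1])
    heat = np.zeros(frame.shape, dtype=np.float32)
    heat[..., 0] = 255.0 * cam  # red channel only; a real colormap would look nicer
    blended = alpha * frame.astype(np.float32) + (1.0 - alpha) * heat
    return blended.clip(0, 255).astype(np.uint8)


def read_frames(video_path: str) -> Iterator[np.ndarray]:
    """Yield RGB frames from a video file using OpenCV."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    capture = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame_bgr = capture.read()
            if not ok:
                break
            yield cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    finally:
        capture.release()


def cam_overlays(
    frames: Iterable[np.ndarray],
    compute_cam: Callable[[np.ndarray], np.ndarray],  # hypothetical: frame -> 2D CAM
    alpha: float = 0.5,
) -> List[np.ndarray]:
    """Compute the CAM of every frame and blend it onto that frame."""
    return [overlay_cam(frame, compute_cam(frame), alpha) for frame in frames]


def save_gif(frames: List[np.ndarray], gif_path: str, frame_ms: int = 50) -> None:
    """Write the overlaid frames as an animated GIF using Pillow."""
    from PIL import Image  # imported lazily, same reason as cv2

    images = [Image.fromarray(frame) for frame in frames]
    images[0].save(
        gif_path, save_all=True, append_images=images[1:], duration=frame_ms, loop=0
    )
```

With a real `compute_cam` in place, the whole pipeline would be roughly `save_gif(cam_overlays(read_frames("video.mp4"), compute_cam), "cam.gif")`, where both file names are placeholders.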

from torch-cam.

Harry-KIT commented on May 17, 2024

[attached video: ezgif.com-gif-maker.gif.mp4]

Hi @frgfm. Thanks for your help!

frgfm commented on May 17, 2024

Hello @long123524 👋

Sorry about this! But I'll need more information for understandable reasons 😅
Your snippet is not runnable, could you share a minimal snippet (including imports) that would reproduce this bug please?

According to your console output, it looks like you used a UNet. Are you trying to get the CAM of a segmentation model? 🤔

long123524 commented on May 17, 2024

You are right. I use an encoder-decoder segmentation architecture, like UNet. How should I obtain a CAM for it? Looking forward to your reply.

frgfm commented on May 17, 2024

Alright, so this is a much wider discussion then!

I suggest checking #150 :)
CAMs were designed to show the spatial influence of the activations at a given layer on a scalar output:

  • this scalar output can be the probability of a given class in a multi-class classification model, the output of a binary classification model, or the output of a video classification model. We want a single-channel spatial representation, meaning an (H, W) shape for the CAM. As a simplification, you can think of it as the gradient of this scalar output with respect to the activation at a given layer.
  • now the problem is that a 2D segmentation model outputs a (C, H', W') tensor and not a (C,) vector. So the "CAM" of a segmentation model, following the previous simplification, would be the partial derivative of the output at every location (for a given class) with respect to the activation at every location of a given layer. This means that this "CAM" would not be of shape (H, W) like previously but (H, W, H', W'), which is not exactly fit for visualization.

To make this clearer:

  • in the first case, you check for every spatial location in the activation map, its influence on the scalar output
  • in the second case, that would be checking for every spatial location in the activation map, its influence on the 2D output
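
To make the two shapes concrete, here is a minimal NumPy sketch (not torch-cam code). It assumes toy linear "heads" on top of a single-channel activation and a single output class, and estimates the derivatives by finite differences. The names `numerical_jacobian`, `classifier_head`, and `segmentation_head` are made up for this illustration.

```python
import numpy as np


def numerical_jacobian(fn, activation, eps=1e-3):
    """Finite-difference derivative of fn(activation) w.r.t. each activation entry.

    The result has shape activation.shape + output.shape.
    """
    base = np.asarray(fn(activation), dtype=np.float64)
    jac = np.zeros(activation.shape + base.shape)
    for idx in np.ndindex(*activation.shape):
        bumped = activation.copy()
        bumped[idx] += eps
        jac[idx] = (np.asarray(fn(bumped), dtype=np.float64) - base) / eps
    return jac


rng = np.random.default_rng(0)
H, W = 4, 5            # spatial size of the activation map
H_out, W_out = 8, 10   # spatial size of the segmentation output (H', W')
activation = rng.normal(size=(H, W))

# Toy classification head: activation map -> one scalar score.
w_cls = rng.normal(size=(H, W))


def classifier_head(a):
    return (w_cls * a).sum()


# Toy segmentation head: activation map -> one (H', W') score map for a single class.
w_seg = rng.normal(size=(H_out, W_out, H, W))


def segmentation_head(a):
    return np.einsum("pqhw,hw->pq", w_seg, a)


cam_like = numerical_jacobian(classifier_head, activation)
seg_like = numerical_jacobian(segmentation_head, activation)

print(cam_like.shape)  # (4, 5): one influence value per activation location
print(seg_like.shape)  # (4, 5, 8, 10): one (H', W') map per activation location
```

The first result can be drawn as a single heatmap, while the second holds a full (H', W') map for each of the H × W activation locations, so there is no single image to show.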

So, may I ask: what exactly are you trying to achieve with this CAM? :)

long123524 commented on May 17, 2024

I have the same problem as #150. The aim of our work is to obtain a CAM of a semantic segmentation model, like UNet. However, I failed to get it.

frgfm commented on May 17, 2024

@long123524 the whole point of #150 is about the relevance of CAM for semantic segmentation

Again, there is no definition for that 😅
So could you elaborate on "obtain a CAM of semantic segmentation model" please? 🙏

Harry-KIT commented on May 17, 2024

Hi @frgfm. Good job!
I'd like to know what code you used to generate video_example_wallaby.gif.

frgfm commented on May 17, 2024

Back to the main topic of the issue :)
@long123524 if I understood you correctly, I think I'll close this issue, because a CAM for a semantic segmentation model first needs a definition. So unless I got everything wrong, your request is about something that doesn't have a mathematical definition yet 😅

frgfm commented on May 17, 2024

Closing this as per my previous comment