
Comments (11)

IssamLaradji commented on July 30, 2024 2

The __getitem__ function in the dataset loaders such as in datasets/trancos.py shows you what LCFCN and its loss expect. They expect the following items:

    return {"images": image, "points": points,
            "counts": counts, "index": index,
            "image_path": self.path + name + ".jpg"}

where images is an RGB image with shape (1,3,H,W), points is a matrix with a single point for each object and has the shape (1,H,W), counts has the count for each category with shape (1,K), and index is the image id. Here,

H is the image height
W is the image width
K is the number of classes

You can create a file like trancos.py for your dataset and then load it for training. Let me know if you need help in this part. Cheers!
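As a rough illustration of the item layout described above, here is a minimal sketch that packs one sample into such a dictionary using NumPy arrays. The helper name `make_item`, its arguments, and the convention that class ids run from 1 to K (with 0 as background) are assumptions made for this example; they are not taken from trancos.py, which reads its data from disk.

```python
import numpy as np


def make_item(image_hwc, point_list, n_classes, index, image_path):
    """Pack one sample into the dictionary layout described above.

    image_hwc  : uint8 array of shape (H, W, 3)
    point_list : list of (row, col, class_id), class_id in 1..n_classes
                 (this id convention is an assumption of the sketch)
    """
    h, w, _ = image_hwc.shape

    # (H, W, 3) -> (1, 3, H, W)
    images = image_hwc.transpose(2, 0, 1)[None].astype(np.float32)

    # One non-zero pixel per object; 0 is treated as background here.
    points = np.zeros((1, h, w), dtype=np.int64)
    counts = np.zeros((1, n_classes), dtype=np.int64)
    for row, col, class_id in point_list:
        points[0, row, col] = class_id
        counts[0, class_id - 1] += 1

    return {"images": images, "points": points,
            "counts": counts, "index": index,
            "image_path": image_path}


# Hypothetical 4x6 image with two objects of class 1.
item = make_item(np.zeros((4, 6, 3), dtype=np.uint8),
                 [(1, 1, 1), (2, 4, 1)], n_classes=1,
                 index=0, image_path="example.jpg")
print(item["images"].shape, item["points"].shape, item["counts"].shape)
```

A real loader would likely convert these arrays to torch tensors before returning them.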

from lcfcn.

IssamLaradji commented on July 30, 2024 2
  1. You are right, you don't need regions of interest for your training images.
  2. The image_sets files specify which image files are for training, validation, and testing. So for training, you only load the images listed in image_sets/train.txt from images/.
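The split lookup described above could be sketched roughly as follows. The helper `read_split` is hypothetical, and the sketch assumes each split file lists one image file name per line, which may not match the exact TRANCOS layout. The demo writes a throwaway train.txt to a temporary directory so that it runs on its own.

```python
import os
import tempfile


def read_split(root, split):
    """Return the image paths listed in <root>/image_sets/<split>.txt.

    Assumes one image file name per line (an assumption of this sketch).
    """
    list_file = os.path.join(root, "image_sets", split + ".txt")
    with open(list_file) as f:
        names = [line.strip() for line in f if line.strip()]
    return [os.path.join(root, "images", name) for name in names]


# Build a tiny fake dataset root so the example is self-contained.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "image_sets"))
with open(os.path.join(root, "image_sets", "train.txt"), "w") as f:
    f.write("img_001.jpg\nimg_002.jpg\n")

train_paths = read_split(root, "train")
print(train_paths)
```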


IssamLaradji commented on July 30, 2024 1

Happy to help!

When you say a single point for each object... does that mean something like the center point of each object ?

Yes, you can take the center of the object as a single point, just like the example you showed. The value of the point represents the class of the object.
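For bounding-box annotations, the conversion to a point map might look like the sketch below. The function `boxes_to_points` and the `(x_min, y_min, x_max, y_max, class_id)` box layout are assumptions for illustration, not part of the repository. Note that two boxes sharing the same center pixel would overwrite each other in this simple version.

```python
import numpy as np


def boxes_to_points(boxes, height, width):
    """Turn boxes into an (H, W) map with one labelled pixel per object.

    boxes: iterable of (x_min, y_min, x_max, y_max, class_id); this
    layout is an assumption of the sketch.
    """
    points = np.zeros((height, width), dtype=np.int64)
    for x_min, y_min, x_max, y_max, class_id in boxes:
        # Box center, rounded down to a pixel index.
        col = int((x_min + x_max) // 2)
        row = int((y_min + y_max) // 2)
        # Clamp in case a box touches or exceeds the image border.
        row = min(max(row, 0), height - 1)
        col = min(max(col, 0), width - 1)
        # The pixel value encodes the object's class.
        points[row, col] = class_id
    return points


# One class-1 object covering the top-left 4x4 pixels of a 4x4 image.
points = boxes_to_points([(0, 0, 3, 3, 1)], height=4, width=4)
print(points)
```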


IssamLaradji commented on July 30, 2024 1
  1. The .mat files are binary matrices that represent the regions of interest in the image. Not all datasets have them; for example, shanghai.py doesn't.

  2. The .txt files contain the paths to the images, which you use to load an image at every iteration by calling __getitem__.

  3. The transform functions are used to flip, rotate, and/or normalize the image. Normalization is important if you are using a pretrained network like ResNet, which expects a specific input distribution.
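To make the third item concrete, here is a minimal NumPy sketch of a flip-and-normalize transform. The function name and signature are hypothetical and do not mirror the repository's transform code. The mean and standard deviation are the ImageNet statistics commonly used with torchvision's pretrained models. The flip is applied to the point map as well, so the annotations stay aligned with the image.

```python
import numpy as np

# ImageNet channel statistics commonly used with pretrained ResNets.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def transform(image_hwc, points_hw, flip=False):
    """Optionally flip horizontally, then normalize the image.

    image_hwc : uint8 array of shape (H, W, 3)
    points_hw : integer array of shape (H, W)
    Returns a (3, H, W) float image and the matching (H, W) point map.
    """
    image = image_hwc.astype(np.float32) / 255.0
    if flip:
        # Flip both along the width axis so points follow the pixels.
        image = image[:, ::-1]
        points_hw = points_hw[:, ::-1]
    image = (image - IMAGENET_MEAN) / IMAGENET_STD
    return (np.ascontiguousarray(image.transpose(2, 0, 1)),
            np.ascontiguousarray(points_hw))


# Hypothetical all-white 2x3 image with one object in the top-left corner.
image = np.full((2, 3, 3), 255, dtype=np.uint8)
points = np.zeros((2, 3), dtype=np.int64)
points[0, 0] = 1
out_image, out_points = transform(image, points, flip=True)
print(out_image.shape, out_points)
```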


IssamLaradji commented on July 30, 2024 1

You are welcome! Feel free to open another issue where I can explain each part of the loss and/or architecture for you.

I don't think there is another source yet, but I am planning to create a blog post on this at some point. Sorry :(


gauthsvenkat commented on July 30, 2024

First of all, thanks a ton for the fast reply!

I'll explore the file and try to reverse engineer it as much as possible. I've never worked with PyTorch, so this is pretty new to me.

When you say a single point for each object, does that mean something like the center point of each object? For example,

0 0 0 0
0 1 0 0
0 0 0 0
0 0 0 0

would mean that the 1 corresponds to the center of an object?

If that's the case, I have the four coordinates for each object (I annotated them because I tried to solve it as an object detection challenge), so I could just take the centroid, right?


gauthsvenkat commented on July 30, 2024

Okay, I've seen the trancos.py file and mostly understand what's happening.

  1. What are the .mat files that are being loaded?
  2. Also, what are the .txt files present in the images directory? (I'm looking at the TRANCOS dataset.)
  3. What exactly is the transform function?


gauthsvenkat commented on July 30, 2024
  1. So I don't necessarily need to have regions of interest included in my training images, right?

  2. I get the .txt files in image_sets/*.txt, but what about the .txt files in images/ that have the same names as the images? They have some numbers in them.


gauthsvenkat commented on July 30, 2024

Man, I've been breaking my head over this for the last few days. I don't exactly "get" the loss function (all 4 losses) or how you implemented it in torch. I was hoping that if I understood the loss function, I could write it in Keras (which I'm comfortable with). Is there maybe another source (like a blog post or an article) that explains how you practically implemented the loss (and the entire model in general)?

Thanks a ton for helping out!

(I'm also closing this issue, since you did solve the actual issue.)


tongpinmo commented on July 30, 2024
  1. You are right, you don't need regions of interest for your training images.
  2. The image_sets files specify which image files are for training, validation, and testing. So for training, you only load the images listed in image_sets/train.txt from images/.
     
I have the images and the points files. Should I produce the dots.png and .mat files?


tongpinmo commented on July 30, 2024
  1. The .mat files are binary matrices that represent the regions of interest in the image. Not all datasets have them; for example, shanghai.py doesn't.
  2. The .txt files contain the paths to the images, which you use to load an image at every iteration by calling __getitem__.
  3. The transform functions are used to flip, rotate, and/or normalize the image. Normalization is important if you are using a pretrained network like ResNet, which expects a specific input distribution.

Actually, in shanghai.py, line 45, there are .mat files. So what did you mean when you said that shanghai.py doesn't have them?

