Comments (7)

fzh0917 commented on August 21, 2024

OK, I get it. However, I tested the model on some videos from the OTB dataset and found that performance was better without calling the bgr2rgb function.
Well, maybe those are just a few special samples.
Thank you very much for your patient help.

from pygoturn.

fzh0917 commented on August 21, 2024

I am not a native English speaker, so if you can't fully understand what I mean, please tell me right away and I will do my best to explain what I want to say. Thank you again!

fzh0917 commented on August 21, 2024

I have tried a few things, such as displaying the bounding boxes and image variables with OpenCV and printing the coordinates, widths, heights, shapes, etc. However, after doing that, I am even more confused about what those variables mean; they are not what I had understood. So I really need your help. Thank you!

fzh0917 commented on August 21, 2024

Actually, the functions I can't understand are shift_crop_training_sample(sample, bb_params), crop_sample(sample), cropPadImage(bbox_tight, image) and computeCropPadImageLocation(bbox_tight, image).
I understand part of the logic of computeCropPadImageLocation(bbox_tight, image), but after reading variables such as left_half, right_half, top_half, etc., I have a question: is it necessary to write the logic in such a complicated way?

amoudgl commented on August 21, 2024

Hi @fzh0917, I adapted most of these methods from the PY-GOTURN repo. In my shift_crop_training_sample method, I randomly shift the previous box location using the smooth motion model described in the paper, and then return the shifted box location with respect to the cropped image.
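
To make the random shift more concrete, here is a minimal sketch of a Laplace-based motion model in the spirit of the one in the paper. It is not the code from this repo: the function names, the scale values (1/5 for the center shift, 1/15 for the size change) and the size limits (0.6 to 1.4) are assumptions for illustration, and the sketch clips the size ratio instead of resampling and does not keep the box inside the image.

```python
import random


def sample_laplace(scale):
    """Zero-mean Laplace sample, built as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def shift_box(box, shift_scale=1 / 5, size_scale=1 / 15,
              min_ratio=0.6, max_ratio=1.4):
    """Randomly shift and rescale a box given as (x1, y1, x2, y2).

    Hypothetical helper: all default values are assumptions, not values
    read from the repository.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w / 2, y1 + h / 2

    # Size change: a ratio centered on 1, clipped to [min_ratio, max_ratio].
    ratio_w = min(max(1.0 + sample_laplace(size_scale), min_ratio), max_ratio)
    ratio_h = min(max(1.0 + sample_laplace(size_scale), min_ratio), max_ratio)
    new_w, new_h = w * ratio_w, h * ratio_h

    # Center shift: proportional to the current box size, so small
    # displacements are much more likely than large ones.
    new_cx = cx + w * sample_laplace(shift_scale)
    new_cy = cy + h * sample_laplace(shift_scale)

    return (new_cx - new_w / 2, new_cy - new_h / 2,
            new_cx + new_w / 2, new_cy + new_h / 2)
```

Calling shift_box((10, 20, 50, 80)) several times gives boxes that stay close to the input most of the time, which is the "smooth motion" prior: the shifted box is then expressed relative to the crop before being used as a training target.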

To preserve the location of the shifted box in the original image dimensions, I have params like edge_spacing_x and edge_spacing_y, which give me the distance from the x and y axes in the original image, etc. Since I used boundingbox.py from the PY-GOTURN repo directly, I didn't change its nomenclature. search_region gives me the region (box + context-padded image) that was cropped from the original image without any resizing. This is useful because GOTURN does not maintain the aspect ratio while resizing, so to unscale the network output we need the original search region dimensions.

All these optional parameters [opts in helper.py] are only used during inference, for in-place conversion of the bounding box to the original image dimensions, see here. The network just returns the location of the box with respect to the resized crop. This exact same framework was followed in the original GOTURN and PY-GOTURN repositories.
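
As a rough illustration of that conversion, the sketch below maps a box from resized-crop coordinates back to the original image. The function name, the argument layout and the 224-pixel network input size are assumptions for this example, not the actual signatures in helper.py or boundingbox.py.

```python
def to_original_coords(box, net_size, search_size, search_origin, edge_spacing):
    """Map a box from the resized crop back to original image coordinates.

    Hypothetical helper for illustration.
    box           -- (x1, y1, x2, y2) in pixels of the resized crop
    net_size      -- side length of the square network input (e.g. 224)
    search_size   -- (width, height) of the search region before resizing
    search_origin -- (x, y) of the cropped region's top-left corner in the image
    edge_spacing  -- (x, y) padding added when the crop ran past the image edge
    """
    x1, y1, x2, y2 = box

    # Unscale: x and y use different factors because the aspect ratio
    # is not preserved when the crop is resized.
    scale_x = search_size[0] / net_size
    scale_y = search_size[1] / net_size
    x1, x2 = x1 * scale_x, x2 * scale_x
    y1, y2 = y1 * scale_y, y2 * scale_y

    # Uncenter: remove the padding offset, then add the crop's position.
    origin_x, origin_y = search_origin
    edge_x, edge_y = edge_spacing
    return (max(0.0, x1 + origin_x - edge_x),
            max(0.0, y1 + origin_y - edge_y),
            max(0.0, x2 + origin_x - edge_x),
            max(0.0, y2 + origin_y - edge_y))


# A 448x112 search region resized to 224x224, cropped at (100, 50), no padding.
print(to_original_coords((56, 56, 168, 168), 224, (448, 112), (100, 50), (0, 0)))
# prints (212.0, 78.0, 436.0, 134.0)
```

This is why the original search region dimensions have to be carried along: without them, the two different scale factors cannot be recovered from the network output alone.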

The image cropping methods in helper.py, namely cropPadImage and computeCropPadImageLocation, were copied directly from PY-GOTURN (see here). I did this in order to reproduce the GOTURN crop methodology exactly; PY-GOTURN is a direct port of the GOTURN C++ code to Python. I tested them and they worked fine, so I didn't bother to change them. :)
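
For readers puzzled by left_half, right_half and friends, here is a simplified sketch of what such a crop-location computation can look like. It is a paraphrase under assumptions, not the code copied from PY-GOTURN: the function name, the return layout and the context factor of 2 are hypothetical.

```python
def crop_pad_location(box, image_w, image_h, context=2.0):
    """Compute the crop region around a box, clipped to the image.

    Hypothetical, simplified helper. Returns (roi, edge_spacing, output_size):
    roi          -- (x1, y1, x2, y2) of the part of the crop inside the image
    edge_spacing -- (x, y) amount by which the crop would cross the
                    left/top image border (this part gets padded)
    output_size  -- (width, height) of the full, padded crop
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2

    # The full crop is `context` times the box size, centered on the box.
    out_w = max(1.0, context * (x2 - x1))
    out_h = max(1.0, context * (y2 - y1))

    # Each "half" is how far the crop can extend from the center in one
    # direction before it hits the image border.
    left_half = min(out_w / 2, cx)
    right_half = min(out_w / 2, image_w - cx)
    top_half = min(out_h / 2, cy)
    bottom_half = min(out_h / 2, image_h - cy)

    roi_x1 = max(0.0, cx - out_w / 2)
    roi_y1 = max(0.0, cy - out_h / 2)
    roi = (roi_x1, roi_y1,
           roi_x1 + left_half + right_half,
           roi_y1 + top_half + bottom_half)

    edge_spacing = (max(0.0, out_w / 2 - cx), max(0.0, out_h / 2 - cy))
    return roi, edge_spacing, (out_w, out_h)


# A box near the left border of a 200x100 image: the crop is cut off on the left.
print(crop_pad_location((10, 40, 50, 80), 200, 100))
# prints ((0.0, 20.0, 70.0, 100.0), (10.0, 0.0), (80.0, 80.0))
```

The four halves look verbose, but they handle all four image borders with the same rule, and edge_spacing records how much of the crop had to be padded so the box position can be recovered later.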

fzh0917 commented on August 21, 2024

Hi @amoudgl, thank you very much for your detailed explanations last week. All my questions are answered now. However, a new question came to mind today: why do we need to change the order of the image channels from B, G, R to R, G, B when preprocessing images? What is the purpose of this? Thanks.

amoudgl commented on August 21, 2024

I am using a PyTorch pretrained model, which expects inputs to be RGB images normalized with a specific mean and std: https://pytorch.org/docs/stable/torchvision/models.html

Thus, passing BGR images would not be appropriate and leads to suboptimal results.
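
To show why the channel order matters, here is a small pure-Python sketch that preprocesses a single pixel. The mean and std are the ImageNet values listed in the torchvision documentation; the function name is hypothetical, and real code would usually reorder a whole NumPy array at once (for example with img[:, :, ::-1]) rather than work pixel by pixel.

```python
# Per-channel statistics in R, G, B order, as documented for
# torchvision's pretrained models (inputs scaled to [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(bgr):
    """Convert one 8-bit BGR pixel (as OpenCV loads it) to normalized RGB.

    Hypothetical helper for illustration only.
    """
    b, g, r = bgr
    rgb = (r / 255.0, g / 255.0, b / 255.0)  # reorder, then scale to [0, 1]
    return tuple((c - m) / s
                 for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


# A pure red pixel in BGR order: after reordering, the red value is
# normalized with the red mean and std, as the pretrained weights expect.
normalized = preprocess_pixel((0, 0, 255))
```

If the reordering step is skipped, the blue values are normalized with the red statistics and fed to filters that were trained on red, so the pretrained features see a color distribution they were not trained on.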
