Hi, Luca Bertinetto. I am doing the similar thing and just found that you've done a gr

Suffering from problems while implementing the algorithm about siamese-fc

bertinetto commented on May 23, 2024

Suffering from problems while implementing the algorithm

from siamese-fc.

Comments (8)

hanjianglong commented on May 23, 2024

Hi, Luca Bertinetto.
When I train with the downloaded imdb.mat, I get this error: "Reference to non-existent field 'id'." What should I do?


bertinetto commented on May 23, 2024

Hi @LinHungShi,
Sorry for the very late answer; I am taking a break from my PhD to do an internship at the moment.

The concept of scale is related to the size of the object in the previous frame.
The update can be written as, for example, new_size = s*old_size. If s>1 the object is increasing in size; if s<1 it is decreasing. At each frame we only search over 3 "scales": s=1, s=1.02 (or something similar) and s=1/1.02.
I'm not sure I understand what you are asking. Yes, all the images were processed in the same way during data curation, and we produced 2 crops of different sizes per frame. The procedure and the code are available in the ILSVRC15-curation folder.
The procedure to convert pixels in the response map to pixels in frame coordinates is detailed and documented in tracker_step.m.
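For readers following along, here is a minimal Python sketch of the scale update described above. It is not taken from the repository: the function names `search_scales` and `update_size` are hypothetical, the step of 1.02 is the approximate value mentioned in this comment, and the damping factor of 0.35 is the linear-interpolation factor quoted from the paper elsewhere in this thread. Treating 0.35 as the weight given to the newly estimated size is an assumption.

```python
def search_scales(num_scales=3, step=1.02):
    """Return scale factors centred on 1, e.g. [1/1.02, 1, 1.02]."""
    half = num_scales // 2
    return [step ** k for k in range(-half, half + 1)]


def update_size(old_size, best_scale, damping=0.35):
    """Damped version of new_size = s * old_size.

    damping=1.0 applies the full scale change; smaller values move only
    part of the way from the old size towards the rescaled size.
    (Assumption: damping is the weight of the new estimate.)
    """
    w, h = old_size
    new_w = (1 - damping) * w + damping * best_scale * w
    new_h = (1 - damping) * h + damping * best_scale * h
    return new_w, new_h


if __name__ == "__main__":
    scales = search_scales()
    # Hypothetical peak response for each scaled search image.
    peak_scores = [0.61, 0.70, 0.83]
    best = scales[peak_scores.index(max(peak_scores))]
    print(best, update_size((100.0, 50.0), best))
```

With s>1 selected the box grows slightly, and with s<1 it shrinks; the damping keeps a single noisy frame from changing the size abruptly.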


bertinetto commented on May 23, 2024

Hi,
Apologies for the late answer: I was at CVPR + holidays.

Yep, we do upsample the score map during tracking.
Yes, we want to limit the stride of the network to avoid reducing the spatial resolution too much. Do you need to use a pretrained network with a large stride? You can try simply upsampling the activations, or you can instead train a head of the network that performs the upsampling.
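As a rough illustration of "just upsample the activations", the sketch below enlarges a small 2-D score map by an integer factor. It uses bilinear interpolation purely for brevity; this thread does not say which interpolation the repository uses, and the function name `upsample_bilinear` is hypothetical.

```python
def upsample_bilinear(grid, factor):
    """Upsample a 2-D list of floats by an integer factor (bilinear)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for i in range(rows * factor):
        # Map the output pixel centre back to input coordinates and
        # clamp so that border pixels replicate the edge values.
        y = min(max((i + 0.5) / factor - 0.5, 0.0), rows - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, rows - 1)
        fy = y - y0
        row = []
        for j in range(cols * factor):
            x = min(max((j + 0.5) / factor - 0.5, 0.0), cols - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, cols - 1)
            fx = x - x0
            top = (1 - fx) * grid[y0][x0] + fx * grid[y0][x1]
            bottom = (1 - fx) * grid[y1][x0] + fx * grid[y1][x1]
            row.append((1 - fy) * top + fy * bottom)
        out.append(row)
    return out


if __name__ == "__main__":
    # A 17x17 map with a single peak at its centre, upsampled by 16.
    score = [[0.0] * 17 for _ in range(17)]
    score[8][8] = 1.0
    up = upsample_bilinear(score, 16)
    print(len(up), len(up[0]))  # 272 272
```

Upsampling does not add information, but it lets the peak be located at a finer granularity than the network stride alone would allow.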


LinHungShi commented on May 23, 2024

Hi, thanks for the reply. CVPR is really a great conference; I hope you enjoyed your trip.
If it's no bother, I have a few more questions about the paper that I need your help with.

In the paper, you talk about multiple scales, for example: "Multiple scales are searched in a single forward-pass by assembling a mini-batch of scaled images", "Tracking through scale space is achieved by processing several scaled versions of the search image. Any change in scale is penalized and updates of the current scale are damped" and "To handle scale variations, we also search for the object over five scales, and update the scale by linear interpolation with a factor of 0.35 to provide damping". I don't quite understand the meaning of "scale" in this context. Did you change the candidate/search image sizes? Could you explain the concept in more detail?
My understanding of the exemplar and candidate image sizes is that you extract 127x127 and 256x256 patches from the image. However, in Data Curation, you scale the images. Did you scale both the exemplar and the candidate images? Since only the scale factor s is specified, the area of the scaled image will be 127*127 (the area of the exemplar image), but the width and height might differ. Could you give me a general procedure for how you process the images?
After getting the score map, you upsample it from 17x17 to 272x272. Since the candidate image is 256x256, which is smaller than the upsampled score map, how do we know which score corresponds to which pixel?

I'd really appreciate it if you could give me some hints. Thanks.


zsjerongdu commented on May 23, 2024

Hi, Luca Bertinetto.
I have run into the same problem as hanjianglong. When I train with the downloaded imdb.mat, I get this error: "Reference to non-existent field 'id'." Besides, I don't understand the purpose of save_crops.m: what kind of crops does it generate, and what are these crops used for? Would you please give me a hint?


shikongzxz commented on May 23, 2024

Hi @bertinetto,
I am afraid you have not clarified one of @LinHungShi's questions: the response map is 17x17 and the network's stride is 8. In your code, disp_instanceInput = disp_instanceFinal * p.totalStride / p.responseUp, so the maximum value of disp_instanceInput is 68, which is much smaller than x_crop's half size of 127. This means that an object lying more than 68 pixels from the centre cannot be detected.

Could you please explain this in detail? Thanks.
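The arithmetic behind the figure of 68 can be checked with a short Python sketch. It only mirrors the expression quoted above; the function names are hypothetical, the upsampling factor of 16 is inferred from the 17x17 to 272x272 sizes mentioned in this thread, and the search crop size of 255 is an assumption based on the stated half size of 127.

```python
def to_search_crop_pixels(disp_upsampled, total_stride=8, response_up=16):
    """Mirror of disp_instanceFinal * p.totalStride / p.responseUp."""
    return disp_upsampled * total_stride / response_up


def max_crop_displacement(score_size=17, total_stride=8, response_up=16):
    """Largest displacement reachable from the centre of the upsampled map."""
    up_size = score_size * response_up  # 17 * 16 = 272
    return to_search_crop_pixels(up_size / 2, total_stride, response_up)


def num_window_positions(search_size=255, exemplar_size=127, total_stride=8):
    """Positions of an exemplar-sized window slid over the search crop."""
    return (search_size - exemplar_size) // total_stride + 1


def max_window_offset(search_size=255, exemplar_size=127):
    """Largest centre offset that keeps the window inside the crop."""
    return (search_size - exemplar_size) / 2


if __name__ == "__main__":
    print(max_crop_displacement())  # 68.0
    print(num_window_positions())   # 17
    print(max_window_offset())      # 64.0
```

One possible reading of these numbers: a 127-pixel window can only be centred up to 64 pixels from the middle of a 255-pixel crop while staying fully inside it, so a bound of roughly 68 pixels follows from the geometry of the 17x17 response map rather than from an extra restriction in the displacement formula.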


shikongzxz commented on May 23, 2024

I think I have figured it out myself. Apologies for the bother.


sysu-shey commented on May 23, 2024

Hi, Luca Bertinetto.
When I train with the downloaded imdb.mat, I get this error: "Reference to non-existent field 'id'." What should I do?
