jerryx1110 / rpcmvos
[AAAI22 Oral] Reliable Propagation-Correction Modulation for Video Object Segmentation
License: MIT License
Hi there,
I am unsure whether the ResNet backbone loaded properly. My output log shows:
Remove ['backbone.bn1.num_batches_tracked', 'backbone.layer1.0.bn1.num_batches_tracked', 'backbone.layer1.0.bn2.num_batches_tracked',...]
Load pretrained backbone model from /work3/s204161/BachelorData/vos_data/resnet101-deeplabv3p.pth.tar.
Start training.
Itr:0, LR:0.0000100, Time:213.254, Obj:1.4, S0: L 1.989(1.989) IoU 0.211(0.211), S1: L 2.101(2.101) IoU 0.192(0.192), S2: L 2.097(2.097) IoU 0.180(0.180)
....
I am not sure what removing `backbone.bn1.num_batches_tracked` means, and I'm worried that the loader has detected a mismatch and is discarding the pretrained backbone entirely. The full list of removed keys is included further down below.
Thanks for your work
Remove ['backbone.bn1.num_batches_tracked', 'backbone.layer1.0.bn1.num_batches_tracked', 'backbone.layer1.0.bn2.num_batches_tracked', 'backbone.layer1.0.bn3.num_batches_tracked', 'backbone.layer1.0.downsample.1.num_batches_tracked', 'backbone.layer1.1.bn1.num_batches_tracked', 'backbone.layer1.1.bn2.num_batches_tracked', 'backbone.layer1.1.bn3.num_batches_tracked', 'backbone.layer1.2.bn1.num_batches_tracked', 'backbone.layer1.2.bn2.num_batches_tracked', 'backbone.layer1.2.bn3.num_batches_tracked', 'backbone.layer2.0.bn1.num_batches_tracked', 'backbone.layer2.0.bn2.num_batches_tracked', 'backbone.layer2.0.bn3.num_batches_tracked', 'backbone.layer2.0.downsample.1.num_batches_tracked', 'backbone.layer2.1.bn1.num_batches_tracked', 'backbone.layer2.1.bn2.num_batches_tracked', 'backbone.layer2.1.bn3.num_batches_tracked', 'backbone.layer2.2.bn1.num_batches_tracked', 'backbone.layer2.2.bn2.num_batches_tracked', 'backbone.layer2.2.bn3.num_batches_tracked', 'backbone.layer2.3.bn1.num_batches_tracked', 'backbone.layer2.3.bn2.num_batches_tracked', 'backbone.layer2.3.bn3.num_batches_tracked', 'backbone.layer3.0.bn1.num_batches_tracked', 'backbone.layer3.0.bn2.num_batches_tracked', 'backbone.layer3.0.bn3.num_batches_tracked', 'backbone.layer3.0.downsample.1.num_batches_tracked', 'backbone.layer3.1.bn1.num_batches_tracked', 'backbone.layer3.1.bn2.num_batches_tracked', 'backbone.layer3.1.bn3.num_batches_tracked', 'backbone.layer3.2.bn1.num_batches_tracked', 'backbone.layer3.2.bn2.num_batches_tracked', 'backbone.layer3.2.bn3.num_batches_tracked', 'backbone.layer3.3.bn1.num_batches_tracked', 'backbone.layer3.3.bn2.num_batches_tracked', 'backbone.layer3.3.bn3.num_batches_tracked', 'backbone.layer3.4.bn1.num_batches_tracked', 'backbone.layer3.4.bn2.num_batches_tracked', 'backbone.layer3.4.bn3.num_batches_tracked', 'backbone.layer3.5.bn1.num_batches_tracked', 'backbone.layer3.5.bn2.num_batches_tracked', 'backbone.layer3.5.bn3.num_batches_tracked', 'backbone.layer3.6.bn1.num_batches_tracked', 'backbone.layer3.6.bn2.num_batches_tracked', 'backbone.layer3.6.bn3.num_batches_tracked', 'backbone.layer3.7.bn1.num_batches_tracked', 'backbone.layer3.7.bn2.num_batches_tracked', 'backbone.layer3.7.bn3.num_batches_tracked', 'backbone.layer3.8.bn1.num_batches_tracked', 'backbone.layer3.8.bn2.num_batches_tracked', 'backbone.layer3.8.bn3.num_batches_tracked', 'backbone.layer3.9.bn1.num_batches_tracked', 'backbone.layer3.9.bn2.num_batches_tracked', 'backbone.layer3.9.bn3.num_batches_tracked', 'backbone.layer3.10.bn1.num_batches_tracked', 'backbone.layer3.10.bn2.num_batches_tracked', 'backbone.layer3.10.bn3.num_batches_tracked', 'backbone.layer3.11.bn1.num_batches_tracked', 'backbone.layer3.11.bn2.num_batches_tracked', 'backbone.layer3.11.bn3.num_batches_tracked', 'backbone.layer3.12.bn1.num_batches_tracked', 'backbone.layer3.12.bn2.num_batches_tracked', 'backbone.layer3.12.bn3.num_batches_tracked', 'backbone.layer3.13.bn1.num_batches_tracked', 'backbone.layer3.13.bn2.num_batches_tracked', 'backbone.layer3.13.bn3.num_batches_tracked', 'backbone.layer3.14.bn1.num_batches_tracked', 'backbone.layer3.14.bn2.num_batches_tracked', 'backbone.layer3.14.bn3.num_batches_tracked', 'backbone.layer3.15.bn1.num_batches_tracked', 'backbone.layer3.15.bn2.num_batches_tracked', 'backbone.layer3.15.bn3.num_batches_tracked', 'backbone.layer3.16.bn1.num_batches_tracked', 'backbone.layer3.16.bn2.num_batches_tracked', 'backbone.layer3.16.bn3.num_batches_tracked', 'backbone.layer3.17.bn1.num_batches_tracked', 
'backbone.layer3.17.bn2.num_batches_tracked', 'backbone.layer3.17.bn3.num_batches_tracked', 'backbone.layer3.18.bn1.num_batches_tracked', 'backbone.layer3.18.bn2.num_batches_tracked', 'backbone.layer3.18.bn3.num_batches_tracked', 'backbone.layer3.19.bn1.num_batches_tracked', 'backbone.layer3.19.bn2.num_batches_tracked', 'backbone.layer3.19.bn3.num_batches_tracked', 'backbone.layer3.20.bn1.num_batches_tracked', 'backbone.layer3.20.bn2.num_batches_tracked', 'backbone.layer3.20.bn3.num_batches_tracked', 'backbone.layer3.21.bn1.num_batches_tracked', 'backbone.layer3.21.bn2.num_batches_tracked', 'backbone.layer3.21.bn3.num_batches_tracked', 'backbone.layer3.22.bn1.num_batches_tracked', 'backbone.layer3.22.bn2.num_batches_tracked', 'backbone.layer3.22.bn3.num_batches_tracked', 'backbone.layer4.0.bn1.num_batches_tracked', 'backbone.layer4.0.bn2.num_batches_tracked', 'backbone.layer4.0.bn3.num_batches_tracked', 'backbone.layer4.0.downsample.1.num_batches_tracked', 'backbone.layer4.1.bn1.num_batches_tracked', 'backbone.layer4.1.bn2.num_batches_tracked', 'backbone.layer4.1.bn3.num_batches_tracked', 'backbone.layer4.2.bn1.num_batches_tracked', 'backbone.layer4.2.bn2.num_batches_tracked', 'backbone.layer4.2.bn3.num_batches_tracked', 'aspp.aspp1.bn.num_batches_tracked', 'aspp.aspp2.bn.num_batches_tracked', 'aspp.aspp3.bn.num_batches_tracked', 'aspp.aspp4.bn.num_batches_tracked', 'aspp.global_avg_pool.2.num_batches_tracked', 'aspp.bn1.num_batches_tracked', 'decoder.bn1.num_batches_tracked', 'decoder.last_conv.1.num_batches_tracked', 'decoder.last_conv.5.num_batches_tracked', 'decoder.last_last_conv.weight', 'decoder.last_last_conv.bias'] from pretrained model.
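For context: `num_batches_tracked` is a bookkeeping buffer that PyTorch BatchNorm layers have kept since v0.4; it counts update steps and is only consulted when `momentum=None`, so dropping it from a checkpoint does not discard any learned weights. In the list above, every removed key except `decoder.last_last_conv.weight`/`bias` is such a counter (those two are presumably dropped because the output head is re-initialized). A minimal sketch of the usual filtering pattern, using a stand-in model rather than the repository's loader:

```python
import torch
import torch.nn as nn

# Stand-in model for illustration; the same pattern applies to the
# RPCMVOS backbone when loading resnet101-deeplabv3p.pth.tar.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# Pretend this came from torch.load(<checkpoint>, map_location="cpu");
# it includes the '1.num_batches_tracked' buffer.
pretrained_state = model.state_dict()

# Keep keys present in the model, dropping the BN step counters --
# exactly what the "Remove [...] from pretrained model." log reports.
model_state = model.state_dict()
filtered = {k: v for k, v in pretrained_state.items()
            if k in model_state
            and not k.endswith("num_batches_tracked")}
removed = [k for k in pretrained_state if k not in filtered]
print("Remove", removed, "from pretrained model.")

# strict=False tolerates the dropped buffer keys; the conv/BN
# weights themselves are loaded normally.
model.load_state_dict(filtered, strict=False)
```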
Sorry to bother you; I recently read your excellent paper. I am very interested in the information-entropy formula in the "Prediction reliability" part of the paper. Could you explain how the probability denoted by the symbol P in that formula is calculated?
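If it helps other readers: in segmentation models, a per-pixel entropy of this kind is usually computed from the softmax probabilities of the predicted score maps. A minimal sketch under that assumption (illustrative names, not the authors' code):

```python
import torch
import torch.nn.functional as F

def pixel_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Per-pixel entropy H = -sum_c p_c * log(p_c) from raw score maps.

    logits: (N, C, H, W) unnormalized scores over C objects/background.
    returns: (N, H, W) entropy map; low entropy = confident prediction.
    """
    log_p = F.log_softmax(logits, dim=1)  # numerically stable log P
    p = log_p.exp()
    return -(p * log_p).sum(dim=1)

# Example: a random score map for background + 2 objects.
scores = torch.randn(1, 3, 4, 4)
print(pixel_entropy(scores).shape)  # torch.Size([1, 4, 4])
```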
Thanks for your excellent work! I have a detailed question about the channel reweighting in the two modulators.
In Fig. 2, both w_p and w_c are sent to both modulators, but according to the "Modulator block" section, the reweighting is performed separately by the two vectors (i.e., w_p for propagation, w_c for correction). I am confused about this point.
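For reference, channel reweighting in this style usually means scaling each feature channel by one entry of the weight vector, as in squeeze-and-excitation blocks. The sketch below is one illustrative reading of "w_p for propagation, w_c for correction", not the authors' implementation:

```python
import torch

def modulate(features: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Scale each channel of `features` (N, C, H, W) by the matching
    entry of the vector `w` (N, C), i.e. channel-wise reweighting."""
    return features * w.view(*w.shape, 1, 1)

feat_prop = torch.randn(2, 64, 32, 32)
feat_corr = torch.randn(2, 64, 32, 32)
w_p = torch.sigmoid(torch.randn(2, 64))  # propagation weights
w_c = torch.sigmoid(torch.randn(2, 64))  # correction weights

# One reading of Fig. 2: both vectors may be computed from shared
# inputs, but each modulator applies only its own vector.
out_prop = modulate(feat_prop, w_p)
out_corr = modulate(feat_corr, w_c)
```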
Hi Xiaohao,
Good work!
I am a master's student at USTC, and I have some confusion about the code implementation. For example, in matching.py, line 196, the distance is calculated as `(torch.sigmoid(nn_features_reshape + dis_bias.view(1, 1, 1, -1, 1)) - 0.5) * 2`, but according to the original paper it should be `(0.5 - torch.sigmoid(-(nn_features_reshape + dis_bias.view(1, 1, 1, -1, 1)))) * 2`. Why?
This is my email, [email protected]. Could you share your WeChat ID with me? I have some questions I'd like to ask.
Thanks~
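Worth noting: the two expressions above are algebraically identical, because sigmoid(-x) = 1 - sigmoid(x) implies 0.5 - sigmoid(-x) = sigmoid(x) - 0.5. A quick numerical check:

```python
import torch

x = torch.randn(1000)
a = (torch.sigmoid(x) - 0.5) * 2   # form used in matching.py
b = (0.5 - torch.sigmoid(-x)) * 2  # form written in the paper
print(torch.allclose(a, b, atol=1e-6))  # True
```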
Excuse me! Could you provide the pretrained weight file resnet101-deeplabv3p.pth.tar?
Hi, the pretrained weights file named "resnet101-deeplabv3p.pth.tar" for the backbone is missing.
Hi there,
I have downloaded and trained RPCMVOS on YT-VOS19, but when I try to evaluate custom test images I get errors, which I have narrowed down to how the reference label is defined.
By adding a new if-statement in eval_manager_rpa.py that uses the existing YOUTUBE_VOS_Test(), I am able to run RPCMVOS on YT-VOS19 examples and on my own images, but only if I replace the respective annotated image with a dummy annotation from YT-VOS19. I am not sure what format the code expects for the annotation....
The error I get is:
Traceback (most recent call last):
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 79, in <module>
main()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
evaluator.evaluating()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb,
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 179, in before_seghead_process
global_matching_fg = global_matching_for_eval(
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/layers/matching.py", line 282, in global_matching_for_eval
reference_labels_flat = reference_labels.view(-1, obj_nums)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
When I debug, reference_labels turns out to be an empty tensor. I have tried converting my annotated image into an RGB image, which I am not sure I did correctly, but then I get this error:
main()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
evaluator.evaluating()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb,
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 176, in before_seghead_process
seq_ref_frame_label = seq_ref_frame_label.squeeze(1).permute(1,2,0)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 4 is not equal to len(dims) = 3
My annotated image:
My annotated image after RGB:
This leads me to ask: what format is the annotated image expected to be in?
Thanks for your attention.
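For anyone hitting the same errors: DAVIS- and YouTube-VOS-style annotations are single-channel palette-mode ("P") PNGs whose pixel values are object IDs (0 = background), not RGB images; an RGB mask would plausibly explain both the empty reference_labels tensor and the extra dimension in the permute error. A minimal sketch of writing a mask in that format (the palette colors and sizes are illustrative):

```python
import numpy as np
from PIL import Image

# Integer mask: 0 = background, 1..K = object IDs.
mask = np.zeros((480, 854), dtype=np.uint8)
mask[100:200, 300:500] = 1  # object 1

ann = Image.fromarray(mask, mode="P")
# The palette only affects visualization; the loader reads the raw
# pixel indices as object IDs.
ann.putpalette([0, 0, 0,        # 0: background (black)
                236, 95, 103    # 1: object 1 (illustrative color)
                ] + [0] * (256 * 3 - 6))
ann.save("00000.png")
```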
Hello, would it be convenient to provide the model weight file (save_step_400000.pth)? I want to run inference with the model and see the results.
Hello, I have read your paper carefully, but there is one point I still don't understand and would like to ask about: as the iterations proceed, does the Reliable patch pool keep information from all past frames, or only from the first frame and the previous frame? Also, what is the difference between the Reliable patch pool and the memory read in STM-based methods? Thanks!
Hello, thank you very much for your great work. Can you provide your multi-scale test code?