jerryx1110 / rpcmvos
[AAAI22 Oral] Reliable Propagation-Correction Modulation for Video Object Segmentation
License: MIT License
Hi there,
I am unsure whether the ResNet backbone loaded properly. My output log shows:
Remove ['backbone.bn1.num_batches_tracked', 'backbone.layer1.0.bn1.num_batches_tracked', 'backbone.layer1.0.bn2.num_batches_tracked',...]
Load pretrained backbone model from /work3/s204161/BachelorData/vos_data/resnet101-deeplabv3p.pth.tar.
Start training.
Itr:0, LR:0.0000100, Time:213.254, Obj:1.4, S0: L 1.989(1.989) IoU 0.211(0.211), S1: L 2.101(2.101) IoU 0.192(0.192), S2: L 2.097(2.097) IoU 0.180(0.180)
....
I am not sure what removing `backbone.bn1.num_batches_tracked` means, and I'm worried that the loader has detected a mismatch and is discarding the pretrained backbone entirely. The full list of removed keys is included further down below.
Thanks for your work
Remove ['backbone.bn1.num_batches_tracked', 'backbone.layer1.0.bn1.num_batches_tracked', 'backbone.layer1.0.bn2.num_batches_tracked', 'backbone.layer1.0.bn3.num_batches_tracked', 'backbone.layer1.0.downsample.1.num_batches_tracked', 'backbone.layer1.1.bn1.num_batches_tracked', 'backbone.layer1.1.bn2.num_batches_tracked', 'backbone.layer1.1.bn3.num_batches_tracked', 'backbone.layer1.2.bn1.num_batches_tracked', 'backbone.layer1.2.bn2.num_batches_tracked', 'backbone.layer1.2.bn3.num_batches_tracked', 'backbone.layer2.0.bn1.num_batches_tracked', 'backbone.layer2.0.bn2.num_batches_tracked', 'backbone.layer2.0.bn3.num_batches_tracked', 'backbone.layer2.0.downsample.1.num_batches_tracked', 'backbone.layer2.1.bn1.num_batches_tracked', 'backbone.layer2.1.bn2.num_batches_tracked', 'backbone.layer2.1.bn3.num_batches_tracked', 'backbone.layer2.2.bn1.num_batches_tracked', 'backbone.layer2.2.bn2.num_batches_tracked', 'backbone.layer2.2.bn3.num_batches_tracked', 'backbone.layer2.3.bn1.num_batches_tracked', 'backbone.layer2.3.bn2.num_batches_tracked', 'backbone.layer2.3.bn3.num_batches_tracked', 'backbone.layer3.0.bn1.num_batches_tracked', 'backbone.layer3.0.bn2.num_batches_tracked', 'backbone.layer3.0.bn3.num_batches_tracked', 'backbone.layer3.0.downsample.1.num_batches_tracked', 'backbone.layer3.1.bn1.num_batches_tracked', 'backbone.layer3.1.bn2.num_batches_tracked', 'backbone.layer3.1.bn3.num_batches_tracked', 'backbone.layer3.2.bn1.num_batches_tracked', 'backbone.layer3.2.bn2.num_batches_tracked', 'backbone.layer3.2.bn3.num_batches_tracked', 'backbone.layer3.3.bn1.num_batches_tracked', 'backbone.layer3.3.bn2.num_batches_tracked', 'backbone.layer3.3.bn3.num_batches_tracked', 'backbone.layer3.4.bn1.num_batches_tracked', 'backbone.layer3.4.bn2.num_batches_tracked', 'backbone.layer3.4.bn3.num_batches_tracked', 'backbone.layer3.5.bn1.num_batches_tracked', 'backbone.layer3.5.bn2.num_batches_tracked', 'backbone.layer3.5.bn3.num_batches_tracked', 'backbone.layer3.6.bn1.num_batches_tracked', 'backbone.layer3.6.bn2.num_batches_tracked', 'backbone.layer3.6.bn3.num_batches_tracked', 'backbone.layer3.7.bn1.num_batches_tracked', 'backbone.layer3.7.bn2.num_batches_tracked', 'backbone.layer3.7.bn3.num_batches_tracked', 'backbone.layer3.8.bn1.num_batches_tracked', 'backbone.layer3.8.bn2.num_batches_tracked', 'backbone.layer3.8.bn3.num_batches_tracked', 'backbone.layer3.9.bn1.num_batches_tracked', 'backbone.layer3.9.bn2.num_batches_tracked', 'backbone.layer3.9.bn3.num_batches_tracked', 'backbone.layer3.10.bn1.num_batches_tracked', 'backbone.layer3.10.bn2.num_batches_tracked', 'backbone.layer3.10.bn3.num_batches_tracked', 'backbone.layer3.11.bn1.num_batches_tracked', 'backbone.layer3.11.bn2.num_batches_tracked', 'backbone.layer3.11.bn3.num_batches_tracked', 'backbone.layer3.12.bn1.num_batches_tracked', 'backbone.layer3.12.bn2.num_batches_tracked', 'backbone.layer3.12.bn3.num_batches_tracked', 'backbone.layer3.13.bn1.num_batches_tracked', 'backbone.layer3.13.bn2.num_batches_tracked', 'backbone.layer3.13.bn3.num_batches_tracked', 'backbone.layer3.14.bn1.num_batches_tracked', 'backbone.layer3.14.bn2.num_batches_tracked', 'backbone.layer3.14.bn3.num_batches_tracked', 'backbone.layer3.15.bn1.num_batches_tracked', 'backbone.layer3.15.bn2.num_batches_tracked', 'backbone.layer3.15.bn3.num_batches_tracked', 'backbone.layer3.16.bn1.num_batches_tracked', 'backbone.layer3.16.bn2.num_batches_tracked', 'backbone.layer3.16.bn3.num_batches_tracked', 'backbone.layer3.17.bn1.num_batches_tracked', 
'backbone.layer3.17.bn2.num_batches_tracked', 'backbone.layer3.17.bn3.num_batches_tracked', 'backbone.layer3.18.bn1.num_batches_tracked', 'backbone.layer3.18.bn2.num_batches_tracked', 'backbone.layer3.18.bn3.num_batches_tracked', 'backbone.layer3.19.bn1.num_batches_tracked', 'backbone.layer3.19.bn2.num_batches_tracked', 'backbone.layer3.19.bn3.num_batches_tracked', 'backbone.layer3.20.bn1.num_batches_tracked', 'backbone.layer3.20.bn2.num_batches_tracked', 'backbone.layer3.20.bn3.num_batches_tracked', 'backbone.layer3.21.bn1.num_batches_tracked', 'backbone.layer3.21.bn2.num_batches_tracked', 'backbone.layer3.21.bn3.num_batches_tracked', 'backbone.layer3.22.bn1.num_batches_tracked', 'backbone.layer3.22.bn2.num_batches_tracked', 'backbone.layer3.22.bn3.num_batches_tracked', 'backbone.layer4.0.bn1.num_batches_tracked', 'backbone.layer4.0.bn2.num_batches_tracked', 'backbone.layer4.0.bn3.num_batches_tracked', 'backbone.layer4.0.downsample.1.num_batches_tracked', 'backbone.layer4.1.bn1.num_batches_tracked', 'backbone.layer4.1.bn2.num_batches_tracked', 'backbone.layer4.1.bn3.num_batches_tracked', 'backbone.layer4.2.bn1.num_batches_tracked', 'backbone.layer4.2.bn2.num_batches_tracked', 'backbone.layer4.2.bn3.num_batches_tracked', 'aspp.aspp1.bn.num_batches_tracked', 'aspp.aspp2.bn.num_batches_tracked', 'aspp.aspp3.bn.num_batches_tracked', 'aspp.aspp4.bn.num_batches_tracked', 'aspp.global_avg_pool.2.num_batches_tracked', 'aspp.bn1.num_batches_tracked', 'decoder.bn1.num_batches_tracked', 'decoder.last_conv.1.num_batches_tracked', 'decoder.last_conv.5.num_batches_tracked', 'decoder.last_last_conv.weight', 'decoder.last_last_conv.bias'] from pretrained model.
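For context: `num_batches_tracked` is a bookkeeping buffer that PyTorch BatchNorm layers have kept since v0.4; it counts update steps and is only consulted when `momentum=None`, so dropping it from a checkpoint does not discard any learned weights. In the list above, every removed key except `decoder.last_last_conv.weight`/`bias` is such a counter (those two are presumably dropped because the output head is re-initialized). A minimal sketch of the usual filtering pattern, using a stand-in model rather than the repository's loader:

```python
import torch
import torch.nn as nn

# Stand-in model for illustration; the same pattern applies to the
# RPCMVOS backbone when loading resnet101-deeplabv3p.pth.tar.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# Pretend this came from torch.load(<checkpoint>, map_location="cpu");
# it includes the '1.num_batches_tracked' buffer.
pretrained_state = model.state_dict()

# Keep keys present in the model, dropping the BN step counters --
# exactly what the "Remove [...] from pretrained model." log reports.
model_state = model.state_dict()
filtered = {k: v for k, v in pretrained_state.items()
            if k in model_state
            and not k.endswith("num_batches_tracked")}
removed = [k for k in pretrained_state if k not in filtered]
print("Remove", removed, "from pretrained model.")

# strict=False tolerates the dropped buffer keys; the conv/BN
# weights themselves are loaded normally.
model.load_state_dict(filtered, strict=False)
```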
Sorry to bother you; I recently read your excellent paper. I am very interested in the information-entropy formula in the "Prediction reliability" part of the paper. Could you explain how the probability denoted by the symbol P in that formula is calculated?
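If it helps other readers: in segmentation models, a per-pixel entropy of this kind is usually computed from the softmax probabilities of the predicted score maps. A minimal sketch under that assumption (illustrative names, not the authors' code):

```python
import torch
import torch.nn.functional as F

def pixel_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Per-pixel entropy H = -sum_c p_c * log(p_c) from raw score maps.

    logits: (N, C, H, W) unnormalized scores over C objects/background.
    returns: (N, H, W) entropy map; low entropy = confident prediction.
    """
    log_p = F.log_softmax(logits, dim=1)  # numerically stable log P
    p = log_p.exp()
    return -(p * log_p).sum(dim=1)

# Example: a random score map for background + 2 objects.
scores = torch.randn(1, 3, 4, 4)
print(pixel_entropy(scores).shape)  # torch.Size([1, 4, 4])
```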
Thanks for your excellent work! I have a detailed question about the channel reweighting in the two modulators.
In Fig. 2, both w_p and w_c are sent to both modulators, but according to the "Modulator block" section, the reweighting is performed separately by the two vectors (i.e., w_p for propagation, w_c for correction). I am confused about this point.
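For reference, channel reweighting in this style usually means scaling each feature channel by one entry of the weight vector, as in squeeze-and-excitation blocks. The sketch below is one illustrative reading of "w_p for propagation, w_c for correction", not the authors' implementation:

```python
import torch

def modulate(features: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Scale each channel of `features` (N, C, H, W) by the matching
    entry of the vector `w` (N, C), i.e. channel-wise reweighting."""
    return features * w.view(*w.shape, 1, 1)

feat_prop = torch.randn(2, 64, 32, 32)
feat_corr = torch.randn(2, 64, 32, 32)
w_p = torch.sigmoid(torch.randn(2, 64))  # propagation weights
w_c = torch.sigmoid(torch.randn(2, 64))  # correction weights

# One reading of Fig. 2: both vectors may be computed from shared
# inputs, but each modulator applies only its own vector.
out_prop = modulate(feat_prop, w_p)
out_corr = modulate(feat_corr, w_c)
```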
Hi Xiaohao,
Good work!
I am a master's student at USTC, and I have some confusion about the code implementation. For example, in matching.py, line 196, the distance is calculated as `(torch.sigmoid(nn_features_reshape + dis_bias.view(1, 1, 1, -1, 1)) - 0.5) * 2`, but according to the original paper it should be `(0.5 - torch.sigmoid(-(nn_features_reshape + dis_bias.view(1, 1, 1, -1, 1)))) * 2`. Why?
This is my email, [email protected]. Could you share your WeChat ID with me? I have some questions I'd like to ask.
Thanks~
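Worth noting: the two expressions above are algebraically identical, because sigmoid(-x) = 1 - sigmoid(x) implies 0.5 - sigmoid(-x) = sigmoid(x) - 0.5. A quick numerical check:

```python
import torch

x = torch.randn(1000)
a = (torch.sigmoid(x) - 0.5) * 2   # form used in matching.py
b = (0.5 - torch.sigmoid(-x)) * 2  # form written in the paper
print(torch.allclose(a, b, atol=1e-6))  # True
```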
Excuse me! Could you provide the pretrained weight file resnet101-deeplabv3p.pth.tar?
Hi, the pretrained weights file named "resnet101-deeplabv3p.pth.tar" for the backbone is missing.
Hi there,
I have downloaded and trained RPCMVOS on YT-VOS19, but when I try to evaluate custom test images I get errors, which I have narrowed down to how the reference label is defined.
By adding a new if-statement in eval_manager_rpa.py that uses the existing YOUTUBE_VOS_Test(), I am able to run RPCMVOS on YT-VOS19 examples and on my own images, but only if I replace the respective annotated image with a dummy annotation from YT-VOS19. I am not sure what format the code expects for the annotation....
The error I get is:
Traceback (most recent call last):
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 79, in <module>
main()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
evaluator.evaluating()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb,
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 179, in before_seghead_process
global_matching_fg = global_matching_for_eval(
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/layers/matching.py", line 282, in global_matching_for_eval
reference_labels_flat = reference_labels.view(-1, obj_nums)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
When I debug, reference_labels turns out to be an empty tensor. I have tried converting my annotated image into an RGB image, which I am not sure I did correctly, but then I get this error:
main()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
evaluator.evaluating()
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb,
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 176, in before_seghead_process
seq_ref_frame_label = seq_ref_frame_label.squeeze(1).permute(1,2,0)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 4 is not equal to len(dims) = 3
My annotated image:
My annotated image after RGB:
This leads me to ask: what format is the annotated image expected to be in?
Thanks for your attention.
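For anyone hitting the same errors: DAVIS- and YouTube-VOS-style annotations are single-channel palette-mode ("P") PNGs whose pixel values are object IDs (0 = background), not RGB images; an RGB mask would plausibly explain both the empty reference_labels tensor and the extra dimension in the permute error. A minimal sketch of writing a mask in that format (the palette colors and sizes are illustrative):

```python
import numpy as np
from PIL import Image

# Integer mask: 0 = background, 1..K = object IDs.
mask = np.zeros((480, 854), dtype=np.uint8)
mask[100:200, 300:500] = 1  # object 1

ann = Image.fromarray(mask, mode="P")
# The palette only affects visualization; the loader reads the raw
# pixel indices as object IDs.
ann.putpalette([0, 0, 0,        # 0: background (black)
                236, 95, 103    # 1: object 1 (illustrative color)
                ] + [0] * (256 * 3 - 6))
ann.save("00000.png")
```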
Hello, would it be convenient to provide the model weight file (save_step_400000.pth)? I want to run inference with the model and see the results.
Hello, I have read your paper carefully, but there is one point I still don't understand and would like to ask about: as the iterations proceed, does the Reliable patch pool keep information from all past frames, or only from the first frame and the previous frame? Also, what is the difference between the Reliable patch pool and the memory read in STM-based methods? Thanks!
Hello, thank you very much for your great work. Can you provide your multi-scale test code?