Thanks for releasing the code for ViT-PCM. It's excellent work that shows the ability of ViT for WSSS. When I read your paper, I could not find any details about the retraining method. Could you please tell me which pretrained weights you used in the experiments? ImageNet-1k pretrained weights or COCO pretrained weights?
Thanks for your excellent work, @rossettisimone. I am confused about some details. Could you provide some explanation?
For the patch class mapping, the weight should be of size e*K, so how is the extra background class obtained?
Also, in the ablation study, I wonder why Loss_et contributes to the background.