tfzhou / ContrastiveSeg
ICCV 2021 (Oral) - Exploring Cross-Image Pixel Contrast for Semantic Segmentation
License: MIT License
Hi Zhou,
I have read your paper and found it absolutely amazing. Great work!
However, I felt a bit lost when trying to read your code. Could you please briefly explain how the "_dequeue_and_enqueue" function works in trainer_contrastive.py?
In general,
Thank you
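A minimal sketch of how a per-class FIFO memory update of this kind typically works; all names and shapes below are assumptions rather than the repo's exact code:

import torch
import torch.nn as nn

# Hypothetical sizes: 19 classes, 5000 slots per class, 256-d embeddings,
# at most K pixels enqueued per class per step.
num_classes, memory_size, feat_dim, K = 19, 5000, 256, 10
pixel_queue = torch.zeros(num_classes, memory_size, feat_dim)
pixel_queue_ptr = torch.zeros(num_classes, dtype=torch.long)

def dequeue_and_enqueue(feats, labels):
    # feats: [N, feat_dim] pixel embeddings; labels: [N] class ids
    for lb in labels.unique().tolist():
        feat = feats[labels == lb][:K]        # up to K pixels of this class
        k = feat.shape[0]
        ptr = int(pixel_queue_ptr[lb])
        if ptr + k >= memory_size:            # queue full: overwrite the tail, reset
            pixel_queue[lb, -k:, :] = nn.functional.normalize(feat, p=2, dim=1)
            pixel_queue_ptr[lb] = 0
        else:                                 # normal case: write k slots, advance
            pixel_queue[lb, ptr:ptr + k, :] = nn.functional.normalize(feat, p=2, dim=1)
            pixel_queue_ptr[lb] = ptr + k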
Hello, I tried to run contrastive seg with the memory bank, but the code does not seem to connect to loss_contrast_mem.py, and I cannot find a suitable seg_net to use in trainer_contrastive.py.
Look forward to your help.
Hi @tfzhou
I've been studying your repo from time to time for a couple of months. I'm implementing something different; however, I need to change the loss function, and I'm not sure it will be saved during training. I couldn't find the lines where the loss function is saved. In the init functions, the weights are loaded from the configer, but I couldn't find where they are saved. The losses should be saved at each epoch, right? Am I thinking about this correctly?
Hello. Thanks for your great work.
I ran the script cityscapes/hrnet/run_h_48_d_4_contrast_mem.sh, but it turns out that the result is worse than the baseline.
There may be some bugs in the implementation of the memory, such as pixel_queue_ptr[lb] = (pixel_queue_ptr[lb] + 1) % self.memory_size at line 138 of trainer_contrastive.py. I also find that the semi-hard example sampling is not implemented in your code.
I think these may be the reasons I can't reproduce your results. Would you provide an updated version of the memory bank implementation?
Looking forward to your reply!
When will you provide the pretrained model?
Thank you very much for your excellent work. I have a few questions for you.
1. What is the function of the network_stride parameter, and why do you perform this operation on the labels?
2. What are the dimensions of the two parameters (keys, labels) passed to the _dequeue_and_enqueue function, and are they the same?
3. Can your code handle segmentation tasks with labels starting from 0? I don't understand the statement this_label_ids = [x for x in this_label_ids if x > 0] in the _dequeue_and_enqueue function.
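A guess at questions 1 and 3 (my assumptions, not the author's answer), sketched below: network_stride presumably downsamples the labels to the backbone's output resolution so each remaining label aligns with one feature vector, and class 0 is presumably a background/ignore label, which would explain the x > 0 filter.

import torch
import torch.nn.functional as F

# Hypothetical sketch: align full-resolution labels with stride-8 features.
labels = torch.randint(0, 19, (2, 512, 512))            # [B, H, W] ground truth
network_stride = 8                                       # assumed output stride
small = F.interpolate(labels.unsqueeze(1).float(),
                      size=(512 // network_stride, 512 // network_stride),
                      mode='nearest').squeeze(1).long()  # one label per feature location

present = torch.unique(small)
present = [x for x in present.tolist() if x > 0]         # drop class 0 (background/ignore?)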
Thank you very much for sharing. There is a place in the code that I don't understand; could you please explain it to me?
In loss_contrast.py, what is the meaning of _hard_anchor_sampling? Would you please explain its use to me?
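One plausible reading of _hard_anchor_sampling (an assumption, not the author's answer): for each class present in the batch, sample a fixed number of anchor pixels, mixing "hard" pixels, where the prediction disagrees with the label, with "easy" ones where it agrees. A toy sketch:

import torch

# Toy sketch of hard-anchor sampling for one class (hypothetical shapes).
cls_id, n_view = 5, 8
this_y_hat = torch.randint(0, 19, (1024,))   # ground-truth labels per pixel
this_y = torch.randint(0, 19, (1024,))       # predicted labels per pixel

hard = ((this_y_hat == cls_id) & (this_y != cls_id)).nonzero().squeeze(1)  # misclassified
easy = ((this_y_hat == cls_id) & (this_y == cls_id)).nonzero().squeeze(1)  # correct

# Take half the views from hard pixels, the rest from easy ones (when available).
n_hard = min(len(hard), n_view // 2)
n_easy = min(len(easy), n_view - n_hard)
idx = torch.cat([hard[torch.randperm(len(hard))[:n_hard]],
                 easy[torch.randperm(len(easy))[:n_easy]]])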
Hello, there are a lot of files in the project. If I just want to reproduce simple results on the CamVid dataset combined with the pixel contrast learning method, which parts should I look at for model creation, the loss function, and data preprocessing, and which .py and .sh files for training and testing?
I hope you can take the time to give me some simple guidelines, which would benefit me a lot. Thank you!
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
I can't find the implementation of your 10% hardest sampling strategy. The memory bank only randomly stores K positive and negative pairs.
Hello, thank you very much for your contribution. Could you share the code for the Figure 5 visualization? Thank you!
Thank you for your code. But I cannot find the implementation of the memory bank. How do you realize the contrastive loss?
Hello! Your work is outstanding!
I did detailed research on both the paper and the code.
There are a lot of questions that I don’t understand:
Hi,
Thanks for your great work! I am wondering how you select feature embeddings for the t-SNE visualisation? For the dense pixel-level segmentation task, using the feature embeddings of all pixels would be far too many.
Many thanks!
.....
2022-01-13 06:47:40,820 INFO [trainer_contrastive.py, 318] 60 images processed
2022-01-13 06:47:40,894 INFO [trainer_contrastive.py, 318] 60 images processed
2022-01-13 06:47:41,245 INFO [trainer_contrastive.py, 318] 60 images processed
2022-01-13 06:47:41,654 INFO [trainer_contrastive.py, 318] 60 images processed
2022-01-13 06:47:45,347 INFO [base.py, 84] Performance 0.0 -> 0.38793964487753546
2022-01-13 06:47:49,701 INFO [trainer_contrastive.py, 394] Test Time 50.082s, (0.795) Loss 0.64488318
2022-01-13 06:47:49,702 INFO [base.py, 33] Result for seg
2022-01-13 06:47:49,702 INFO [base.py, 49] Mean IOU: 0.38793964487753546
2022-01-13 06:47:49,702 INFO [base.py, 50] Pixel ACC: 0.8894303367559715
.......
2022-01-13 06:58:05,453 INFO [trainer_contrastive.py, 283] Train Epoch: 2 Train Iteration: 1930 Time 6.409s / 10iters, (0.641) Forward Time 2.593s / 10iters, (0.259) Backward Time 3.720s / 10iters, (0.372) Loss Time 0.081s / 10iters, (0.008) Data load 0.014s / 10iters, (0.001401)
Learning rate = [0.00956490946871452, 0.00956490946871452] Loss = 0.31870443 (ave = 0.43296322)
The config details of my experiment are:
(contrastSeg) xxx@ubuntu:~/paper_code/ContrastiveSeg-main$ bash scripts/cityscapes/hrnet/run_h_48_d_4_ocr_contrast.sh train 4
Logging to /home/xxx/paper_code/ContrastiveSeg-main//logs/Cityscapes/hrnet_w48_ocr_contrast_lr1x_ocr_contrast_40k.log
World size: 4
2022-01-13 06:35:32,252 INFO [distributed.py, 49] ['main_contrastive.py', '--configs', 'configs/cityscapes/H_48_D_4.json', '--drop_last', 'y', '--phase', 'train', '--gathered', 'n', '--loss_balance', 'y', '--log_to_file', 'n', '--backbone', 'hrnet48', '--model_name', 'hrnet_w48_ocr_contrast', '--data_dir', '/home/xxx/paper_code/ContrastiveSeg-main/dataset//Cityscapes', '--loss_type', 'contrast_auxce_loss', '--gpu', '0', '1', '2', '3', '--max_iters', '40000', '--max_epoch', '500', '--checkpoints_root', '/home/xxx/paper_code/ContrastiveSeg-main/Checkpoints/hrnet_w48_ocr_contrast/Cityscapes/', '--checkpoints_name', 'hrnet_w48_ocr_contrast_lr1x_ocr_contrast_40k', '--pretrained', '/home/xxx/paper_code/ContrastiveSeg-main/pretrained_model/hrnetv2_w48_imagenet_pretrained.pth', '--train_batch_size', '4', '--distributed', '--base_lr', '0.01']
I figured the problem was probably with the loading of the dataset, but I failed to find the concrete cause. What's more, this error does not occur stably at the same iteration step!! What an amazing problem. So, I want to ask the author for details about the data format and for a simple demo of running the .sh file. Thank you so much, and waiting for your reply.
Hi, Dr. Zhou.
Sorry to bother you, but could you please give an example of how to run the script files?
Considering the various versions of the bash files, it would be nice to have a description of your naming style.
For example: (1) "run_ideal_spatial_ocrnet.sh", "xxx_ideal_gather_ocrnet.sh", and "xxx_ideal_distribute_octnet.sh";
(2) "run_ideal_spatial_ocrnet.sh", "run_ideal_spatial_ocrnet_b.sh", and "run_ideal_spatial_ocrnet_c.sh".
Thank you, and looking forward to your response.
Hi,
First, congrats on the acceptance to ICCV'21.
I have a question about the ResNet-101 + DeepLab-V3 benchmark result.
Your table and log show an mIoU of 72.75 with the pretrained model. However, the standard benchmark for ResNet-101 is 77.12 and for ResNet-50 is 79.09 (both also using pretrained models).
Could you explain why the ResNet-101 + DeepLab-V3 baseline performs so badly in your setting?
Best,
Hello, nice work.
I am curious how to visualize embeddings on a 2D plot with t-SNE, and where the embeddings come from.
The first question is about embedding collection. Specifically, there are two ways of collecting embeddings from images (or feature maps).
The second question: do the embeddings collected above come from the output of the last conv layer before the classifier, or from the MLP projection head?
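For what it's worth, one common recipe (an assumption, not necessarily what the paper did) is to subsample a few hundred pixel embeddings per class from the embedding map and feed them to scikit-learn's TSNE:

import torch
from sklearn.manifold import TSNE

# Hypothetical shapes: embed is a pixel embedding map, labels the aligned GT.
embed = torch.randn(4, 256, 64, 64)                  # [B, D, h, w]
labels = torch.randint(0, 19, (4, 64, 64))           # [B, h, w]

feats = embed.permute(0, 2, 3, 1).reshape(-1, 256)   # one row per pixel
labs = labels.reshape(-1)

# Subsample up to 200 pixels per class so t-SNE stays tractable.
keep = []
for c in labs.unique().tolist():
    idx = (labs == c).nonzero().squeeze(1)
    keep.append(idx[torch.randperm(len(idx))[:200]])
keep = torch.cat(keep)

xy = TSNE(n_components=2).fit_transform(feats[keep].numpy())  # [N, 2] for plotting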
I use a 3090 to train but run out of memory; the dataset is Cityscapes. How can I solve this problem?
How do I run inference on my own images?
Thanks,
Hi, thank you for the great work. I am currently using your pipeline on my dataset. I have a question about the hard anchor sampling function. The label (128 x 128) is resampled to the feature size (16 x 16), but the segmentation prediction from the model is not resampled, so an error about mismatched dimensions between labels and predictions occurs at the following line:
hard_indices = ((this_y_hat == cls_id) & (this_y != cls_id)).nonzero()
Is there anything that I can do to fix the error?
Thank you so much!
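One possible workaround, assuming the shapes described above (a guess, not an official fix): downsample the prediction to the feature size before the comparison, mirroring what is done to the label:

import torch
import torch.nn.functional as F

# Hypothetical shapes matching the question: labels at 16x16, preds at 128x128.
pred_logits = torch.randn(1, 19, 128, 128)
this_y_hat = torch.randint(0, 19, (16 * 16,))        # downsampled GT, flattened

pred_small = F.interpolate(pred_logits, size=(16, 16), mode='bilinear',
                           align_corners=False)
this_y = pred_small.argmax(dim=1).reshape(-1)        # predicted labels, flattened

cls_id = 5
hard_indices = ((this_y_hat == cls_id) & (this_y != cls_id)).nonzero()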
Hi, Dr. Zhou,
Thanks for releasing the code. When reading the code for the contrastive loss in the _contrastive() function, a mask is computed by the following two lines:
ContrastiveSeg/lib/loss/loss_contrast_mem.py, line 124 in 2ab84d8
ContrastiveSeg/lib/loss/loss_contrast_mem.py, line 131 in 2ab84d8
Now I think the shape of the mask is [anchor_num * anchor_count, class_num * cache_size]. If I did not misunderstand the code, the mask is a 'positive' mask, and each row represents the positive samples of an anchor view.
Then, in L134-L138, the function of logits_mask is confusing:
ContrastiveSeg/lib/loss/loss_contrast_mem.py, lines 134 to 138 in 2ab84d8
Suppose I have anchor_num=6 (2 images, 3 valid classes per image), anchor_count=2 (sample two pixels per class), class_num=5 (class number), cache_size=2 (memory size), then the following code raises RuntimeError:
mask = torch.ones((6 * 2, 5 * 2)).scatter_(1, torch.arange(6 * 2).view(-1, 1), 0)
Output:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: index 10 is out of bounds for dimension 1 with size 10
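For comparison, the scatter_ pattern does work in the non-memory case, where anchors and contrast features are the same set and the matrix is square, so each anchor simply masks out its own diagonal entry (a minimal sketch, not the repo's exact code):

import torch

n = 6 * 2  # anchor_num * anchor_count: anchors contrast against themselves
logits_mask = torch.ones(n, n).scatter_(1, torch.arange(n).view(-1, 1), 0)
# Zeros on the diagonal exclude each anchor's trivial match with itself.
# With a memory bank the matrix is [n, class_num * cache_size], so indexing
# columns by arange(n) can exceed the width, exactly as in the error above.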
A related question, on line 30 (ContrastiveSeg/lib/loss/loss_contrast.py, line 30 in 2ab84d8) and on line 144 (ContrastiveSeg/lib/loss/loss_contrast.py, line 144 in 2ab84d8): should y_hat actually be the prediction from the model instead of the target from the label?
Thank you for your excellent work, Dr. Zhou.
I am a novice postgraduate and my coding ability is limited. I want to use your method in my own two-class segmentation model. Which .py files should I read, and what should I pay attention to?
Looking forward to your answer, thank you very much.
Hello, Dr. Zhou. I cannot find where the memory bank is applied in the codebase. Could you point me to it?
Hi, Thanks for your great work.
Could you help explain what n_view stands for?
ContrastiveSeg/lib/loss/loss_contrast.py, lines 47 to 51 in 3101207
I understand the meaning of total_classes and feat_dim, but it's difficult for me to understand the meaning of n_view here. Thanks.
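If the sampling code is read as building feats_ of shape [total_classes, n_view, feat_dim] (my reading, not confirmed), then n_view is the number of pixels sampled per (image, class) group, i.e. how many "views" of each class enter the loss:

import torch

total_classes = 6   # (image, class) anchor groups in the batch
n_view = 50         # pixels sampled per group: the "views" of that class
feat_dim = 256      # embedding dimension
feats_ = torch.randn(total_classes, n_view, feat_dim)
labels_ = torch.randint(0, 19, (total_classes,))
# Rows of feats_ sharing a label form positive pairs; the loss sees
# total_classes * n_view anchors once the view dimension is unbound.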
Hi, Thanks for your nice work.
I have some questions about t-SNE visualization.
Hi, thanks for the great work!
However, I found that the contrastive loss here seems to be different from Eqn. (3) in the paper. According to the paper, the denominator should be 1 specific positive pair plus all the negative pairs. But in the current implementation, the denominator will be all the positive pairs (excluding identical matching) plus all the negative pairs.
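Restating the two variants as described above (my notation; i is an anchor embedding, P_i and N_i its positive and negative sets with the anchor itself excluded from P_i, and tau the temperature):

\mathcal{L}_i^{\text{paper}} = \frac{1}{|\mathcal{P}_i|} \sum_{p \in \mathcal{P}_i} -\log \frac{\exp(i \cdot p/\tau)}{\exp(i \cdot p/\tau) + \sum_{n \in \mathcal{N}_i} \exp(i \cdot n/\tau)}

\mathcal{L}_i^{\text{code}} = \frac{1}{|\mathcal{P}_i|} \sum_{p \in \mathcal{P}_i} -\log \frac{\exp(i \cdot p/\tau)}{\sum_{j \in \mathcal{P}_i \cup \mathcal{N}_i} \exp(i \cdot j/\tau)}

The first form keeps only the current positive pair in the denominator; the second also sums over all other positives, which matches the standard supervised contrastive (SupCon) formulation.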
Hi, sir. Thank you for releasing this great code. I am trying to use your model on my dataset, which is about optic cup and disc segmentation and has 3 classes. I rewrote the .sh and .json files: "num_classes": 3, "label_list": [1,2,3].
Here are two errors; can you help? Thank you!
1. RuntimeError: weight tensor should be defined either for all or no classes

Traceback (most recent call last):
  File "/data/fyw/ContrastiveOpticSegmentation/main_contrastive.py", line 236, in <module>
    model.train()
  File "/data/fyw/ContrastiveOpticSegmentation/segmentor/trainer_contrastive.py", line 420, in train
    self.__train()
  File "/data/fyw/ContrastiveOpticSegmentation/segmentor/trainer_contrastive.py", line 241, in __train
    loss = self.pixel_loss(outputs, targets, with_embed=with_embed)
  File "/data/anaconda3/envs/ContrastiveOpticSegmentation/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/fyw/ContrastiveOpticSegmentation/lib/loss/loss_contrast_mem.py", line 218, in forward
    loss = self.seg_criterion(pred, target)
  File "/data/anaconda3/envs/ContrastiveOpticSegmentation/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/fyw/ContrastiveOpticSegmentation/lib/loss/loss_helper.py", line 204, in forward
    loss = self.ce_loss(inputs, target)
  File "/data/anaconda3/envs/ContrastiveOpticSegmentation/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/anaconda3/envs/ContrastiveOpticSegmentation/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1150, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/data/anaconda3/envs/ContrastiveOpticSegmentation/lib/python3.8/site-packages/torch/nn/functional.py", line 2846, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
2. Traceback (most recent call last):
  File "main_contrastive.py", line 185, in <module>
    handle_distributed(args_parser, os.path.expanduser(os.path.abspath(__file__)))
  File "/data/fyw/ContrastiveOpticSegmentation/lib/utils/distributed.py", line 70, in handle_distributed
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/data/anaconda3/envs/ContrastiveOpticSegmentation/bin/python', '-u', '-m', 'torch.distributed.launch', '--nproc_per_node', '4', '--master_port', '29962', '/data/fyw/ContrastiveOpticSegmentation/main_contrastive.py', '--configs', 'configs/REFUGE/H_48_D_4_MEM.json', '--drop_last', 'y', '--phase', 'train', '--gathered', 'n', '--loss_balance', 'y', '--log_to_file', 'n', '--backbone', 'hrnet48', '--model_name', 'hrnet_w48_mem', '--gpu', '4', '5', '6', '7', '--data_dir', '/data/dataset/REFUGE', '--loss_type', 'mem_contrast_ce_loss', '--max_iters', '40000', '--train_batch_size', '8', '--checkpoints_root', '/data/fyw/ContrastiveOpticSegmentation/Model/REFUGE/', '--checkpoints_name', 'hrnet_w48_mem_paddle_lr2x_1', '--pretrained', '/data/dataset/hrnetv2_w48_imagenet_pretrained.pth', '--distributed', '--base_lr', '0.01']' returned non-zero exit status 1.
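A likely cause of error 1 (a guess, not confirmed): a per-class weight list in the loss configuration that no longer matches num_classes = 3. PyTorch requires the cross-entropy weight tensor to have exactly one entry per class:

import torch
import torch.nn as nn

# If the config still carries a weight list from another dataset (e.g. 19
# Cityscapes entries) while num_classes is 3, PyTorch raises
# "weight tensor should be defined either for all or no classes".
pred = torch.randn(2, 3, 8, 8)                   # logits for 3 classes
target = torch.randint(0, 3, (2, 8, 8))          # labels in {0, 1, 2}
ce = nn.CrossEntropyLoss(weight=torch.ones(3))   # weight length == num_classes
loss = ce(pred, target)                          # works; a length-19 weight would fail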
Hi, thank you for the amazing work!
Your paper and concept are really interesting.
I have trouble understanding your contrastive loss function here. What should the input of this _contrastive function be? In other words, would you let me know what feats_ and labels_ are, and what "number of anchors X number of views" means?
Hello,
I have a problem with the code pixel_queue_ptr[lb] = (pixel_queue_ptr[lb] + 1) % self.memory_size at line 138, in the _dequeue_and_enqueue function of trainer_contrastive.py. Should pixel_queue_ptr[lb] + 1 be modified to pixel_queue_ptr[lb] + K? Otherwise, pixel_queue[lb, ptr + 1:ptr + 1 + K, :] will be assigned at the next iteration, which overlaps with pixel_queue[lb, ptr:ptr + K, :].
Thanks for your nice work!
Is the repo based on pytorch==0.4.x?
And also, what are the FLOPs (computation cost) of your method?
Could it run on an older PyTorch, like 1.2.0 instead of 1.7.1?
thanks!
Hi, thanks for your great work. Will you update the code for the memory bank?
Hi, how do you use t-SNE in a segmentation network to get such a visualization?
Thanks
It seems that a paper, Region-aware Contrastive Learning for Semantic Segmentation, may have plagiarized your work.
Thank you very much for your work. I have a question: my model is BiSeNetV2, and I want to replace my loss function with your pixel contrast loss. But my model's output is a single tensor, while the preds argument of the pixel contrast loss is a dictionary with keys 'seg' and 'embed'. I don't know what 'seg' and 'embed' mean, or how to produce these two values from BiSeNetV2.
class ContrastCELoss(nn.Module, ABC):
    def forward(self, preds, target, with_embed=False):
        h, w = target.size(1), target.size(2)
        assert "seg" in preds
        assert "embed" in preds
        seg = preds['seg']          # segmentation logits for the CE term
        embedding = preds['embed']  # pixel embeddings for the contrastive term
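A hypothetical sketch of how a model can produce such a dictionary (names like SegWithEmbedding and proj_head are made up here; this is not the official code): keep the usual classifier for 'seg' and add a small projection head whose normalized output serves as 'embed':

import torch
import torch.nn as nn
import torch.nn.functional as F

class SegWithEmbedding(nn.Module):
    def __init__(self, backbone, in_channels, num_classes, embed_dim=256):
        super().__init__()
        self.backbone = backbone                      # e.g. BiSeNetV2 feature extractor
        self.classifier = nn.Conv2d(in_channels, num_classes, 1)
        self.proj_head = nn.Sequential(               # MLP-style projection head
            nn.Conv2d(in_channels, in_channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, embed_dim, 1),
        )

    def forward(self, x):
        feat = self.backbone(x)
        seg = self.classifier(feat)                        # logits for the CE loss
        embed = F.normalize(self.proj_head(feat), p=2, dim=1)  # unit-norm embeddings
        return {'seg': seg, 'embed': embed}

# Example with a stand-in backbone:
backbone = nn.Conv2d(3, 128, 3, padding=1)
model = SegWithEmbedding(backbone, in_channels=128, num_classes=19)
out = model(torch.randn(1, 3, 64, 64))  # out['seg']: [1,19,64,64], out['embed']: [1,256,64,64]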
I am not able to run the train file. Could you please share step-by-step tips for running the COCO script?
ContrastiveSeg/lib/loss/loss_contrast.py, line 106 in 2ab84d8
Would you mind explaining why the max of the inner products is subtracted from each inner product per anchor?
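The usual rationale, which presumably applies here as well (an assumption): numerical stability of the softmax. Softmax is invariant to subtracting a per-row constant, but exponentiating large dot products overflows float32:

import torch

logits = torch.tensor([[80.0, 90.0, 100.0]])
naive = logits.exp() / logits.exp().sum()     # exp(100) overflows float32 -> inf/nan
shifted = logits - logits.max(dim=1, keepdim=True).values
stable = shifted.exp() / shifted.exp().sum(dim=1, keepdim=True)  # same softmax, no overflow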
Hi, thanks for your outstanding work.
I have a question when researching your code.
Why is it ignored when the class id equals zero?
Hi, well done on the nice work.
I have the following questions regarding the codebase:
if ptr + K >= self.memory_size:
    # queue full: overwrite the last K slots and reset the pointer to 0
    pixel_queue[lb, -K:, :] = nn.functional.normalize(feat, p=2, dim=1)
    pixel_queue_ptr[lb] = 0
The only thing I found related to getting negatives from the memory is in loss_contrast_mem.py, in the method _sample_negative(self, Q), which, to the best of my understanding, just takes all memory samples per class.
Please correct me if I am missing something here.
If not, are you planning to either add these methods or make trained model weights available?
It would be very helpful if you could address these questions.
Thanks
Hello. Thanks for your excellent work!
I transplanted the loss function and memory bank from your code into my own code and ran it on the Cityscapes dataset and on my own dataset. But it turns out that the decrease in the contrast loss is very small. For example, the contrast loss drops from 1.27 to 1.11 after 80k epochs on Cityscapes, while the CE loss drops from 1.26 to 0.15. The same thing happens on my own dataset. It seems the contrast loss is not contributing much during training.
I wonder, is it normal for the contrast loss to decrease so little? And what can I do to make full use of the contrast loss, like tuning the hyper-parameters?
Thanks for your wonderful work on contrastive learning for semantic segmentation; I want to follow your work.
Dear @tfzhou ,
I find your work really interesting - thanks a lot (also to your team).
In your paper you state that you use an
Using the HRNet backbone, I need to set the full_res_stem env variable to get an embedding of the same dimension as the input image. Otherwise, there is interpolation logic to upsample the feature embeddings, which seems to happen for any use of the ResNet backbone.
I was wondering whether you use the full_res_stem env variable for your stated results? Also, did you observe a performance difference when the feature embedding dimensions are smaller than the input dimensions?
Thanks
Oliver