hkchengrex / mask-propagation

[CVPR 2021] MiVOS - Mask Propagation module. Reproduces STM (with better results) and includes training code. Semi-supervised video object segmentation evaluation.

Home Page: https://hkchengrex.github.io/MiVOS/

License: MIT License

Python 100.00%
computer-vision cvpr2021 deep-learning pytorch segmentation video-object-segmentation video-segmentation

mask-propagation's Issues

How to install thinplate?

Hi, thanks for your great work. How do I install thinplate? Running the code gives:
/Mask-Propagation-main/dataset/tps.py", line 4, in
import thinplate as tps
ModuleNotFoundError: No module named 'thinplate'
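
The traceback says that Python cannot locate a module named `thinplate`. A minimal sketch of checking for it before launching training is shown here. The helper name `is_installed` is hypothetical, and the install hint assumes the module comes from the py-thin-plate-spline repository linked in another issue on this page.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if not is_installed("thinplate"):
        # Assumption: the package is installable directly from its GitHub repository.
        print("thinplate not found; try: "
              "pip install git+https://github.com/cheind/py-thin-plate-spline")
```

If the check fails inside a conda or virtual environment, it may be worth confirming that `pip` and `python` point to the same environment.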

metrics results of test dataset

After running eval_davis_2016.py, I only get the mask files in the output folder. How can I get the metric values such as J and J&F? And how can I evaluate the model on a personal dataset to get those metrics after using interactive_gui.py?
Thanks for your suggestions.
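
For context, J (region similarity) is the intersection-over-union between a predicted mask and the ground-truth mask of one object, and J&F averages it with the boundary measure F. The following is only a sketch of J for a single pair of binary masks. The function name `jaccard` is hypothetical, returning 1 when both masks are empty is a convention assumed here, and F is not covered. Reported benchmark numbers should come from the benchmark's own evaluation code rather than from this sketch.

```python
import numpy as np


def jaccard(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two binary masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    inter = np.logical_and(pred, gt).sum()
    return float(inter) / float(union)


# Toy example: one overlapping pixel out of three pixels in the union.
pred = np.array([[1, 1, 0], [0, 0, 0]])
gt = np.array([[1, 0, 0], [1, 0, 0]])
score = jaccard(pred, gt)
```

For a personal dataset, the same function could be averaged over frames and objects once ground-truth masks exist in the same label format as the predictions.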

Does the kernelized memory need training?

I use the kernelized memory when evaluating STCN but did not use it during training, and the results decreased slightly. Without it, J&F-Mean is 0.916 on DAVIS 2016 val, 0.853 on DAVIS 2017 val, and 0.755 on DAVIS 2017 test-dev. With the kernelized memory, J&F-Mean is 0.913 on DAVIS 2016 val, 0.852 on DAVIS 2017 val, and 0.750 on DAVIS 2017 test-dev. Is this because the kernelized memory needs training? But why would it need training, given that it has no trainable parameters?

J&F performance on BL30K

Hi, I am running BL30K training for DAVIS 2017 val (including stage 0 and stage 1). What J&F should I expect on DAVIS 2017 val after finishing BL30K training? With that number I could check whether my training is correct. I don't think it is included in the README.

RuntimeError: Error(s) in loading state_dict for PropagationNetwork

Hello! I want to train the PropagationNetwork on my own image dataset, so I use the training command CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=4 python -m torch.distributed.launch --master_port 9842 --nproc_per_node=2 train.py --id retrain_s01 --load_network ./saves/propagation_model.pth --stage 0 (based on the pretrained model S012). It throws a runtime error.

loadnetwork_error
The training command works fine without the --load_network parameter. Could you give me some suggestions?

about BL30K

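Hello, could you upload the BL30K dataset to Baidu Netdisk? BL30K is very large, Google Drive has a download quota, and the download is unstable and keeps disconnecting.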

How to run two copies of your code at the same time?

image

I have made two copies of your code and made small changes to each. While one is training, the other cannot be trained. If I want to train both at the same time, which parameters need to be changed? My machine has four 2080 Ti GPUs, so memory is sufficient.
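
One possible cause, which is an assumption because the screenshot is not preserved, is that both runs pass the same `--master_port` to `torch.distributed.launch`, so the second run cannot bind the port. A minimal sketch of asking the operating system for an unused port follows. The helper name `find_free_port` is hypothetical.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a TCP port that is currently unused on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 lets the OS choose any free port.
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

The printed value could be passed as `--master_port` to the second run, with `CUDA_VISIBLE_DEVICES` set to a GPU set that does not overlap the first run. The port is released before the training process starts, so there is a small chance another program takes it in between.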

thin-plate-spline question

Hey,

I'm the author of https://github.com/cheind/py-thin-plate-spline and discovered your dependency here. I've tracked its usage down to

theta = tps.tps_theta_from_points(c_src, c_dst, reduced=True)

I'm curious, what are you using it for? It seems like you are augmenting training data by warping it with a thin-plate-spline model. Was the library useful in this respect, i.e. did it improve performance in a quantifiable way? I'd like to add a link to your paper/model on the project page, if you don't mind.

Pre-training on the BL30K dataset after pre-training on static images

In the static-image pre-training stage, "single_object" in PropagationNetwork is True, so MaskRGBEncoderSO is used.
When I try to load the weights from that stage for pre-training on the BL30K dataset or for main training, "single_object" is now False and the model uses MaskRGBEncoder instead. As a result, the checkpoint cannot be loaded.
Here is the error:
Traceback (most recent call last):
  File "train.py", line 68, in <module>
    total_iter = model.load_model(para['load_model'])
  File "/content/Mask-Propagation/model/model.py", line 180, in load_model
    self.PNet.module.load_state_dict(network)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PropagationNetwork:
    size mismatch for mask_rgb_encoder.conv1.weight: copying a param with shape torch.Size([64, 4, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 5, 7, 7]).

Could you explain how to fix this?
Thank you so much.
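
The error message says that the checkpoint's first convolution has 4 input channels while the current model expects 5. One common workaround is to pad the checkpoint weight with an extra input channel before loading. This is not confirmed here as the author's method. The sketch below illustrates only the shape logic, using NumPy arrays so that it runs without PyTorch. The helper name `pad_input_channels` and the zero initialisation of the new channel are assumptions. With torch tensors the same concatenation could be done with `torch.cat` along dimension 1.

```python
import numpy as np


def pad_input_channels(weight: np.ndarray, target_in: int) -> np.ndarray:
    """Pad a conv weight of shape (out, in, kh, kw) with zero-filled input channels."""
    out_ch, in_ch, kh, kw = weight.shape
    if in_ch >= target_in:
        # Nothing to pad; return the weight unchanged.
        return weight
    pad = np.zeros((out_ch, target_in - in_ch, kh, kw), dtype=weight.dtype)
    # Existing channels keep their trained values; new channels start at zero.
    return np.concatenate([weight, pad], axis=1)


# Shapes taken from the error message: [64, 4, 7, 7] -> [64, 5, 7, 7].
w4 = np.random.rand(64, 4, 7, 7).astype(np.float32)
w5 = pad_input_channels(w4, 5)
print(w5.shape)  # (64, 5, 7, 7)
```

Zero-filled channels leave the network's initial output unchanged for the original four inputs. Whether that is the best initialisation for the extra mask channel is a separate design choice.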

some question about codes

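Hello, I have a question about your code. Suppose that for each frame I store two key feature maps and two value feature maps, as follows:

image

Should self.PNet(Fs[:,0], Ms[:,0]) in PropagationModel then also produce two key feature maps and two value feature maps?
image
Does the input to segment in PropagationNetwork likewise need to take two key feature maps and two value feature maps?
Beyond that, what other changes are needed?
image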

about train stage

Sorry to bother you. I plan to use my own image dataset and video dataset for training. Can I skip stage 1, i.e. train stage 0 on the image dataset and then train stage 2 on the video dataset (which I have converted into the DAVIS dataset format)?

Has anyone experienced CUDA out of memory?

Hi! Thank you for sharing this great code. What machine did you use for training and inference? I am running inference on my own data, and some of the larger video sequences run out of CUDA memory (V100 16 GB).
I also tried loading the model in fp16, but accuracy seemed to be compromised because some folders did not have anything segmented.

Please let me know if there's anything you'd suggest us trying. Thank you very much!

About BL30K

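Hello, I downloaded all six BL30K archives and extracted all of them. During the second pre-training stage I get an error saying that data/dangjisheng/BL30K/a/BL30K/Annotations/kea03423/00020.png cannot be found, and I don't know why. I downloaded all six archives and extracted them into a single directory, so why is a file reported as missing? I look forward to your reply.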

image

image

license file missing

Does the repo/project have any usage restrictions? Is it available for commercial use as well?

Confusion about Fig.6

I apologize for bothering you.
Is Figure 6 generated by compiling relevant data from multiple video sequences, calculating their mean and interquartile range, and then plotting them? If so, could you please specify which dataset's video sequences you used?
image

only mask propagation?

Hi, your method consists of three core components: interaction-to-mask, mask propagation, and difference-aware fusion. Does the semi-supervised video object segmentation code released in this project include only mask propagation?

how to save the feature map of many memory frames?

There is a part of your code that I don't understand. Should each memory frame be stored separately, or should the key-value feature map and the content feature map of the memory frame be concatenated and saved together? Which line saves the memory frame?

About Fig.6 again

Do you include the background when calculating mIoU?
The results I calculated are somewhat unusual; the decrease in mIoU(-0.07 ~ -0.17) is not as significant as shown in Figure 6(-0.1 ~ -0.4).
image
image

I used the original data you provided, 81.5 and 83.1.
image

About STCN, I used the original data you provided, 83.3(computed with official model without top-k) and 85.3.
image
image

Here is my code:

import numpy as np
from PIL import Image


def calculate_iou(pred_mask, true_mask, class_id):
    """IoU of one class between a predicted and a ground-truth label map."""
    pred_class = (pred_mask == class_id)
    true_class = (true_mask == class_id)

    intersection = np.logical_and(pred_class, true_class)
    union = np.logical_or(pred_class, true_class)

    # The epsilon makes the IoU 1 when the class is absent from both masks.
    iou = (np.sum(intersection) + 1e-6) / (np.sum(union) + 1e-6)

    return iou


def calculate_miou(pred_dir, gt_dir):
    """Mean IoU over classes; pred_dir and gt_dir are matching lists of mask file paths."""
    pred_masks = [np.array(Image.open(path)) for path in pred_dir]
    gt_masks = [np.array(Image.open(path)) for path in gt_dir]

    # Class labels are assumed to start from 0; the class count is taken
    # from the first predicted frame only.
    num_classes = int(np.max(pred_masks[0])) + 1

    class_ious = np.zeros(num_classes)

    for class_id in range(num_classes):
        # Average the per-frame IoU of this class over all frames.
        frame_ious = [calculate_iou(pred_mask, true_mask, class_id)
                      for pred_mask, true_mask in zip(pred_masks, gt_masks)]
        class_ious[class_id] = np.mean(frame_ious)

    # This mean includes class 0, i.e. the background.
    mean_iou = np.mean(class_ious)

    return mean_iou, class_ious

What if I didn't use BL30K?

I did not use your pretrained model and trained your network from scratch. My pre-training used only static images, not BL30K. After training finished, the result on the DAVIS 2017 dataset was only 70.6% (J&F mean). Is this result normal? What result should be expected without BL30K?

image

subprocess.CalledProcessError

Hi, thanks for your great work! When I try to run CUDA_VISIBLE_DEVICES=0 OMP_NUM_THREADS=4 python -m torch.distributed.launch --master_port 9842 --nproc_per_node=2 train.py --id retrain_s0 --stage 0, I run into this problem. Can you help me?

File "/home/longma/anaconda2/envs/p3torchstm/lib/python3.6/site-packages/torch/distributed/launch.py", line 242, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/longma/anaconda2/envs/p3torchstm/bin/python', '-u', 'train.py', '--local_rank=1', '--id', 'retrain_s0', '--stage', '0']' returned non-zero exit status 1.
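
The command exposes one GPU (`CUDA_VISIBLE_DEVICES=0`) but requests two processes (`--nproc_per_node=2`), and the process that fails is the one with `--local_rank=1`. A likely but unconfirmed cause is that rank 1 has no GPU to use. The sketch below is a sanity check on these two settings. Both helper names are hypothetical.

```python
def visible_gpu_count(cuda_visible_devices: str) -> int:
    """Count the device entries in a CUDA_VISIBLE_DEVICES-style string."""
    return len([d for d in cuda_visible_devices.split(",") if d.strip()])


def check_launch(cuda_visible_devices: str, nproc_per_node: int) -> bool:
    """Return True if every launched process can be given its own visible GPU."""
    return nproc_per_node <= visible_gpu_count(cuda_visible_devices)


print(check_launch("0", 2))    # False: one visible GPU, two processes
print(check_launch("0,1", 2))  # True: two visible GPUs, two processes
```

Under this assumption, either `CUDA_VISIBLE_DEVICES=0,1` with `--nproc_per_node=2` or `CUDA_VISIBLE_DEVICES=0` with `--nproc_per_node=1` would make the two settings consistent.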

About top-k filtering

Sorry to bother you.
When using top-k, I found that my input values can sometimes be large.
After 'values.exp_()', the result contains 'INF'.
To avoid overflow, I changed the computation to:
image
Is this change reasonable?
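
The screenshot of the modified computation is not preserved, so the exact change cannot be checked here. A standard way to avoid overflow is to subtract the maximum before exponentiating: softmax is invariant to adding a constant to all inputs, so the result is unchanged. The sketch below shows this in NumPy. The function name `topk_softmax` and the layout (memory locations along axis 0, query locations along axis 1) are assumptions.

```python
import numpy as np


def topk_softmax(scores: np.ndarray, k: int) -> np.ndarray:
    """Softmax over the k largest entries of each column; all other entries are 0."""
    # Indices of the k largest scores in each column (unordered within the k).
    idx = np.argpartition(scores, -k, axis=0)[-k:]
    top = np.take_along_axis(scores, idx, axis=0)
    # Subtract the per-column maximum so the largest exponent is exp(0) = 1.
    top = top - top.max(axis=0, keepdims=True)
    e = np.exp(top)
    probs = e / e.sum(axis=0, keepdims=True)
    out = np.zeros_like(scores, dtype=np.float64)
    np.put_along_axis(out, idx, probs, axis=0)
    return out


# Scores near 1000 would overflow with a plain exp().
scores = np.array([[1000.0, 0.0],
                   [999.0, 1.0],
                   [-5.0, 2.0]])
weights = topk_softmax(scores, k=2)
```

Each column of `weights` sums to 1 and contains no `inf` or `nan`, even though `np.exp(1000.0)` alone overflows.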

RuntimeError: CUDA error: out of memory

How many GPUs do you need to test on DAVIS and YouTube? I keep getting out-of-memory errors during my tests. I directly used the model trained on static images for VOS training, skipping the BL30K pre-training. Is that OK?

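How to take the previous frame from all memory frames?

Hello, I have a question.
image

This is the memory read operation between all memory frames and the current frame. If I want to match against only the frame immediately before the current frame, how do I take the last frame kept in memory? In other words, how do I get the previous frame, and how is this implemented in the code? Thank you.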
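
As a rough illustration only: if the memory is stored as an array with a time axis and frames are appended in order, the most recent frame is the last slice along that axis. The sketch below assumes a layout of (channels, time, height, width) and a hypothetical helper `last_memory_frame`. The actual tensor layout in the repository may differ and is not confirmed here.

```python
import numpy as np


def last_memory_frame(mem: np.ndarray, time_axis: int = 1) -> np.ndarray:
    """Keep only the most recently appended frame, preserving the time axis."""
    t = mem.shape[time_axis]
    # Indexing with a list keeps the time dimension with length 1.
    return np.take(mem, [t - 1], axis=time_axis)


# Toy memory: 2 channels, 3 stored frames, 2x2 spatial size.
keys = np.arange(2 * 3 * 2 * 2).reshape(2, 3, 2, 2)
prev_keys = last_memory_frame(keys)
print(prev_keys.shape)  # (2, 1, 2, 2)
```

The last stored frame is the previous frame only if every frame is added to memory. If frames are added at intervals, the previous frame's features would have to be kept separately.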

about batch size

The default value of your batch_size is 1. If we increase the batch size, will the accuracy be improved? Have you done any relevant experiments?

How to install thinplate manually?

(mivos2) dangjisheng@ubuntui:/data/dangjisheng$ pip install git+git://github.com/cheind/py-thin-plate-spline
Collecting git+git://github.com/cheind/py-thin-plate-spline
Cloning git://github.com/cheind/py-thin-plate-spline to /tmp/pip-g3zq_wu4-build
error: Couldn't set refs/heads/master
fatal: update_ref failed for ref 'HEAD':

How can I install thinplate manually? I switched to a different computer and could not install thinplate with the commands you provided.
Thank you very much for your help. I would like to reproduce your code as soon as possible and start some work based on your work.

Backbone uses Resnet50?

Does the backbone use ResNet-50? Would accuracy improve if the backbone were replaced with ResNet-101?
