
indexnet_matting's People

Contributors

kennyj · mowshon · poppinace · tkianai


indexnet_matting's Issues

Inquiry about the supplementary material

Hi,

Great work, and I am looking forward to the release of the training code. By the way, could you share the supplementary material of the original paper on this GitHub? I am really interested in the performance of IndexNet on other visual tasks, as reported in Section 5.4.

Regards,
Mingfu

How to solve this error? I am running the net to predict an image; I have a GPU, but I do not want to use it.

RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu
Traceback:
File "/data/User/ch/CNN/view/image_app.py", line 80, in matting_view
alpha = matting.predict(image, trimap)
File "/root/anaconda3/lib/python3.7/site-packages/graphics/function/Matting.py", line 81, in predict
outputs = self.net(inputs).squeeze().cpu().data.numpy()
File "/root/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/root/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward
"them on device: {}".format(self.src_device_obj, t.device))

Rare inconsistency for background image size during data loading

To generate the synthetic dataset for alpha matting, foreground images are composited onto background images. This will only work if the background image is at least as large as the foreground image.

I noticed a small inconsistency between how backgrounds are resized in Composition_code.py and hldataset.py. The result is that sometimes the background image has a different scale than the background in the composited image, for example in training sample 35787.

Composite image (denormalized): [image]

Background image: [image]

The reason is that Composition_code.py will only resize the background if it is smaller than the foreground, while hldataset.py will always resize the background to the size of the foreground image, even if it is large enough and would not need to be resized.

This is a rare occurrence because the foreground images are usually larger than the background images, so it probably does not matter much.
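
For reference, a sketch of the resize-only-when-needed policy described above, which would make the two files consistent (the function name is hypothetical, not from the repository):

    import cv2

    def resize_background_if_needed(bg, fg_h, fg_w):
        # Upscale the background only when it is smaller than the
        # foreground (the Composition_code.py policy); otherwise keep
        # its original scale.
        bg_h, bg_w = bg.shape[:2]
        if bg_h < fg_h or bg_w < fg_w:
            ratio = max(fg_h / bg_h, fg_w / bg_w)
            bg = cv2.resize(bg, (int(bg_w * ratio) + 1, int(bg_h * ratio) + 1),
                            interpolation=cv2.INTER_CUBIC)
        return bg[:fg_h, :fg_w]   # crop to the exact foreground size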

about the evaluation code

@poppinace
Hi,
Thank you for your great work.
You mentioned you have implemented a Python version of the evaluation code, but I can't find it in the folder. Would you mind telling me where it is?

Questions about B11 model

Thank you for your great work!

I have two questions regarding your B11 model (Unet).
(1) It seems that your code uses both conv and pooling for downsampling, which may be a typo? Which downsampling module do you use?
(2) Is the crop size 320 or 321 in your training of B11?

GPU memory increase during DIM 1K-dataset testing

Hi Hao,

I ran the testing code with both the DIM pretrained model and the IndexNet Matting pretrained model. The GPU I used is a 2080Ti and the PyTorch version is 1.0.

During testing with the IndexNet pretrained model, I observe that GPU memory keeps increasing from 5000+ MB to 10200+ MB. The DIM pretrained model takes only about 2680 MB to 4000+ MB for the first 700+ iterations, but it also suddenly increases to 10800+ MB at around 800+/1000 iterations. As for testing speed, IndexNet Matting (avg: 5.88 Hz) is much slower than the DIM model (avg: 10.75 Hz).

It seems that you use the original size of the DIM images for inference. Is it normal to see increasing memory usage during inference for both the DIM pretrained model and IndexNet Matting?
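
One common cause of growing memory during testing is running the forward pass with autograd enabled, so each image appends to the computation graph; variable input sizes also make the caching allocator grow. A minimal sketch of a guarded test loop (net and test_loader are placeholders, and the 4-channel image+trimap input is an assumption):

    import torch

    net.eval()
    with torch.no_grad():                  # do not build autograd graphs
        for image, trimap in test_loader:
            inputs = torch.cat((image, trimap), dim=1).cuda()
            alpha = net(inputs).squeeze().cpu().numpy()
            torch.cuda.empty_cache()       # optional: release cached blocks
                                           # between differently sized inputs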

By the way, the DIM pretrained model you provided does not seem to be consistent with the evaluation scores you provide on GitHub; here is what I get at the last iteration:

test: 1000/1000, sad: 14.25, SAD: 59.47, MSE: 0.0205, framerate: 10.11Hz/10.75Hz

For IndexNet Matting I get:

test: 1000/1000, sad: 11.49, SAD: 45.65, MSE: 0.0131, framerate: 4.98Hz/5.88Hz

which seems to match the results you report in your paper.

Regards,
Mingfu

Why fix encoder BN?

Hello, great work! I have a question: why do you fix the encoder BN when training? Have you done any comparison?

Number of parameters inconsistent with the paper?

Hello! Thanks for sharing this awesome repository!
When I try your get_model_summary function in the demo file with the default model (m2o DIN nonlinear+context), I find the parameter count is 5,953,515.
According to your paper, the parameters should be about 8.15M, right?
And I notice that when setting use_nonlinear=False, the parameters and GFLOPs are the same as the data in your paper (with or without context).
Do you have a clue about this discrepancy?

Here's what the summary function returns:
Total Parameters: 5,953,515

Total Multiply Adds (For Convolution and Linear Layers only): 5.189814209938049 GFLOPs

Number of Layers
Conv2d: 109 layers
BatchNorm2d: 88 layers
ReLU6: 71 layers
DepthwiseM2OIndexBlock: 5 layers
InvertedResidual: 17 layers
_ASPPModule: 4 layers
AdaptiveAvgPool2d: 1 layer
Dropout: 1 layer
ASPP: 1 layer
IndexedUpsamlping: 7 layers
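
As a cross-check, parameters can be counted directly from the module; summary utilities that only traverse standard layer types can miss parameters held by custom blocks (a generic snippet, not from the repository):

    # count parameters directly, independent of any summary utility
    total = sum(p.numel() for p in net.parameters())
    trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
    print('total: {:,}  trainable: {:,}'.format(total, trainable))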

Errors when using model.train()

Traceback (most recent call last):
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1741, in <module>
main()
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1735, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1135, in run
pydev_imports.execfile(file, globals, locals)  # execute the script
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/CNN/IndexNet-master/algorithm/demo.py", line 36, in <module>
detector.train()
File "/Users/CNN/IndexNet-master/algorithm/train/train.py", line 100, in train
outputs = self.net(image).squeeze().cpu().data.numpy()
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 140, in forward
return self.module(*inputs, **kwargs)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/Users/CNN/IndexNet-master/algorithm/network/hlmobilenetv2.py", line 1134, in forward
l = self.dconv_pp(l7) # 160x10x10
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/Users/CNN/IndexNet-master/algorithm/network/hlaspp.py", line 139, in forward
x5 = self.global_avg_pool(x)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 83, in forward
exponential_average_factor, self.eps)
File "/Users/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1693, in batch_norm
raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
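
The ValueError comes from a BatchNorm layer in training mode receiving a 1x256x1x1 input (the global-average-pooling branch of the ASPP with batch size 1): BN cannot compute statistics from a single value per channel. Two common workarounds are to use a batch size of at least 2, or to keep the BN layers in eval mode while the rest of the net trains (which is what freeze_bn is meant to do). A sketch of the latter:

    import torch.nn as nn

    net.train()
    for m in net.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.eval()   # use running statistics; no per-batch stats needed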

License question

Thank you for this amazing repository!

I tried to find out the license, but I am confused because there are several different ones.

So my question is: What is the license of the code?

Some errors when using the connectivity loss function on GPU

I'm trying to reproduce your paper. The CNN returns a tensor, but the loss function uses NumPy, so I convert the tensor to NumPy, calculate the loss, and then convert back to a tensor. I found that the connectivity loss can't work this way.
How did you deal with it?
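
For context: converting a tensor to NumPy detaches it from the autograd graph, so no gradient can flow back through a loss computed in NumPy; such a loss must either be reimplemented with torch operations or used only as an evaluation metric. A minimal illustration of the problem (the squared-mean loss here is a toy stand-in for the connectivity loss):

    import torch

    pred = torch.rand(1, 1, 32, 32, requires_grad=True)

    # NumPy round trip: the graph is cut, backward() never reaches pred
    loss_np = torch.tensor((pred.detach().numpy() ** 2).mean(),
                           requires_grad=True)
    loss_np.backward()
    print(pred.grad)                 # None: no gradient arrived

    # The same computation kept in torch: gradients flow as expected
    loss_t = (pred ** 2).mean()
    loss_t.backward()
    print(pred.grad is not None)     # True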

Automatic Background Removal technology

I am looking for a deep learning library/SDK which can be used to remove the background from any image automatically (with quality as good as www.remove.bg).

I tried some image segmentation SDKs with pre-trained models such as Tensorflow Lite & Fritz AI, but the accuracy of the cutout mask was very low, amongst other issues.

Criteria:

  1. Background removal rather than just human/portrait segmentation

If the foreground consists of a person holding a balloon, sitting on a chair, with a pet at his side, then I want all of this to be extracted, not just the human cutout. The segmentation SDKs I tried only extract humans (the chair vanishes), and with a very low-quality mask at that (hair gets cut, parts of the ear get cut, etc.).

  2. Mask quality should be super-accurate

I want even the finer details like hair, delicate clothes, etc. to be extracted perfectly.

  3. Fast and lightweight (for mobile phones)

I want to use this technology on mobile phones (in an Android app), which should ideally work even in an offline environment. If this is difficult to achieve, plan B would be to install the technology on our server.

  4. Technology

What technology should I be exploring to achieve this? Is it called image segmentation, or would the better term be image matting? (e.g. http://alphamatting.com/eval_25.php)

I have been reading a lot and I am currently lost in the sea of various technologies out there (OpenCV, Deep Matting, Mask RCNN, Instance Segmentation, Detectron2, Tensorflow, Pytorch, etc). I wonder what magic is happening behind the curtains of www.remove.bg

Would your library help me to achieve what I am looking for? Any help you could provide or a nudge in the right direction would be awesome.

Thanks a ton!

Reproducing results

I tried to reproduce the results on the Adobe 1k Dataset and got exactly the same numbers when using the pretrained model. Very good job with that :)

I also tried to train the model from scratch, but did not succeed yet. Do you have any tips?

What I got so far:

troll_retrained

What it should look like:

troll_original

As you can see, your model produces much sharper results.

My training procedure:

  • Find all x/y coordinate pairs which have alpha values between 0 and 1
  • Pick a random x/y pair
  • Crop a 320x320, 480x480 or 640x640 foreground/alpha image centered on that pixel pair
  • Resize them to 320x320
  • Generate a randomly dilated trimap from the alpha using distance fields (see the sketch after this list)
  • Choose a random background image and resize it to 320x320
  • Blend the foreground and background images
  • Train the network with the Adam optimizer, learning rate 0.01 and otherwise default parameters, for 90000 batches of size 16, decaying the learning rate by a factor of 10 at the 60000th and 78000th batch
  • L1 loss on alpha plus compositional L1 loss, both on the unknown image region only
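
A sketch of the randomly dilated trimap step above, assuming a uint8 alpha in [0, 255] and SciPy available (an illustration of the idea, not the exact code used):

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def random_trimap(alpha, max_dist=25):
        # Pixels within a random distance of the foreground/background
        # boundary (and all semi-transparent pixels) become unknown (128).
        d_fg = distance_transform_edt(alpha == 255)  # depth into foreground
        d_bg = distance_transform_edt(alpha == 0)    # depth into background
        dist = np.random.randint(1, max_dist)
        trimap = np.full_like(alpha, 128)
        trimap[d_fg > dist] = 255                    # deep inside foreground
        trimap[d_bg > dist] = 0                      # deep inside background
        return trimap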

Model:

net = hlmobilenetv2(
    pretrained=True,
    freeze_bn=True,
    output_stride=32,
    apply_aspp=True,
    conv_operator='std_conv',
    decoder='indexnet',
    decoder_kernel_size=5,
    indexnet='depthwise',
    index_mode='m2o',
    use_nonlinear=True,
    use_context=True
)

I've also tried:

  • training for 200000 batches instead of just 90000
  • L2 loss instead of L1 loss
  • only alpha loss
  • only compositional loss
  • pretrained=False
  • freeze_bn=False

I am not sure about first cropping and then resizing, as described in Deep Image Matting, because in every batch it produces a few trimaps which are 100% unknown region. Also, it is impossible to crop a 640x640 patch from some alpha mattes because they don't have any unknown pixels to center the cropped region on.

How can I train the DIM?

Hello, thank you for your great work. I wonder how I can train DIM, since I can only find the train.sh for IndexNet Matting. Could you provide me with the .sh file for DIM? Thank you.

freeze_bn seems to be an invalid option

Dear author,

I am trying to read and reproduce your code, but I found a possible issue with batch normalization.

In the current code, you define a freeze_bn() function to change all batch normalization layers to eval mode, like

self.freeze_bn()

But you neither override the train() function of nn.Module nor call freeze_bn() every time before the training loop.

This means that when the training code calls net.train(), these BN layers switch back to training mode, so freeze_bn actually takes no effect and all training is conducted with BN enabled. Is this right?
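
This reading matches the standard fix, which is to override train() so BN layers are re-frozen every time training mode is set; a minimal sketch of that pattern (the class below is a toy stand-in, not the repository's model):

    import torch.nn as nn

    class MattingNet(nn.Module):
        def __init__(self, freeze_bn=True):
            super().__init__()
            self.freeze_bn = freeze_bn
            self.body = nn.Sequential(nn.Conv2d(4, 8, 3), nn.BatchNorm2d(8))

        def train(self, mode=True):
            super().train(mode)            # usual behaviour first
            if mode and self.freeze_bn:
                for m in self.modules():
                    if isinstance(m, nn.BatchNorm2d):
                        m.eval()           # re-freeze BN on every .train() call
            return self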

error in loading pre-trained model to train on multi-gpu

I am trying to load the pretrained model to fine-tune in a multi-GPU setting. However, I am getting an error message. Here is my code:

        from collections import OrderedDict

        checkpoint = torch.load(args.restore_from)
        pretrained_dict = OrderedDict()
        for key, value in checkpoint['state_dict'].items():
            if key.startswith('module.'):
                key = key[7:]
            pretrained_dict[key] = value
        net.load_state_dict(pretrained_dict)

The error message:

RuntimeError: Error(s) in loading state_dict for hlMobileNetV2UNetDecoderIndexLearning:
Missing key(s) in state_dict: "layer0.1._tmp_running_mean", "layer0.1._tmp_running_var", "layer0.1._running_iter", "layer1.0.conv.1._tmp_running_mean", "layer1.0.conv.1._tmp_running_var", "layer1.0.conv.1._running_iter", "layer1.0.conv.4._tmp_running_mean", "layer1.0.conv.4._tmp_running_var", "layer1.0.conv.4._running_iter", "layer2.0.conv.1._tmp_running_mean", "layer2.0.conv.1._tmp_running_var", "layer2.0.conv.1._running_iter", "layer2.0.conv.4._tmp_running_mean", "layer2.0.conv.4._tmp_running_var", "layer2.0.conv.4._running_iter", "layer2.0.conv.7._tmp_running_mean", "layer2.0.conv.7._tmp_running_var", "layer2.0.conv.7._running_iter", "layer2.1.conv.1._tmp_running_mean", "layer2.1.conv.1._tmp_running_var", "layer2.1.conv.1._running_iter", "layer2.1.conv.4._tmp_running_mean", "layer2.1.conv.4._tmp_running_var", "layer2.1.conv.4._running_iter", "layer2.1.conv.7._tmp_running_mean", "layer2.1.conv.7._tmp_running_var", "layer2.1.conv.7._running_iter", "layer3.0.conv.1._tmp_running_mean", "layer3.0.conv.1._tmp_running_var", "layer3.0.conv.1._running_iter", "layer3.0.conv.4._tmp_running_mean", "layer3.0.conv.4._tmp_running_var", "layer3.0.conv.4._running_iter", "layer3.0.conv.7._tmp_running_mean", "layer3.0.conv.7._tmp_running_var", "layer3.0.conv.7._running_iter", "layer3.1.conv.1._tmp_running_mean", "layer3.1.conv.1._tmp_running_var", "layer3.1.conv.1._running_iter", "layer3.1.conv.4._tmp_running_mean", "layer3.1.conv.4._tmp_running_var", "layer3.1.conv.4._running_iter", "layer3.1.conv.7._tmp_running_mean", "layer3.1.conv.7._tmp_running_var", "layer3.1.conv.7._running_iter", "layer3.2.conv.1._tmp_running_mean", "layer3.2.conv.1._tmp_running_var", "layer3.2.conv.1._running_iter", "layer3.2.conv.4._tmp_running_mean", "layer3.2.conv.4._tmp_running_var", "layer3.2.conv.4._running_iter", "layer3.2.conv.7._tmp_running_mean", "layer3.2.conv.7._tmp_running_var", "layer3.2.conv.7._running_iter", "layer4.0.conv.1._tmp_running_mean", "layer4.0.conv.1._tmp_running_var", "layer4.0.conv.1._running_iter", "layer4.0.conv.4._tmp_running_mean", "layer4.0.conv.4._tmp_running_var", "layer4.0.conv.4._running_iter", "layer4.0.conv.7._tmp_running_mean", "layer4.0.conv.7._tmp_running_var", "layer4.0.conv.7._running_iter", "layer4.1.conv.1._tmp_running_mean", "layer4.1.conv.1._tmp_running_var", "layer4.1.conv.1._running_iter", "layer4.1.conv.4._tmp_running_mean", "layer4.1.conv.4._tmp_running_var", "layer4.1.conv.4._running_iter", "layer4.1.conv.7._tmp_running_mean", "layer4.1.conv.7._tmp_running_var", "layer4.1.conv.7._running_iter", "layer4.2.conv.1._tmp_running_mean", "layer4.2.conv.1._tmp_running_var", "layer4.2.conv.1._running_iter", "layer4.2.conv.4._tmp_running_mean", "layer4.2.conv.4._tmp_running_var", "layer4.2.conv.4._running_iter", "layer4.2.conv.7._tmp_running_mean", "layer4.2.conv.7._tmp_running_var", "layer4.2.conv.7._running_iter", "layer4.3.conv.1._tmp_running_mean", "layer4.3.conv.1._tmp_running_var", "layer4.3.conv.1._running_iter", "layer4.3.conv.4._tmp_running_mean", "layer4.3.conv.4._tmp_running_var", "layer4.3.conv.4._running_iter", "layer4.3.conv.7._tmp_running_mean", "layer4.3.conv.7._tmp_running_var", "layer4.3.conv.7._running_iter", "layer5.0.conv.1._tmp_running_mean", "layer5.0.conv.1._tmp_running_var", "layer5.0.conv.1._running_iter", "layer5.0.conv.4._tmp_running_mean", "layer5.0.conv.4._tmp_running_var", "layer5.0.conv.4._running_iter", "layer5.0.conv.7._tmp_running_mean", "layer5.0.conv.7._tmp_running_var", "layer5.0.conv.7._running_iter", "layer5.1.conv.1._tmp_running_mean", 
"layer5.1.conv.1._tmp_running_var", "layer5.1.conv.1._running_iter", "layer5.1.conv.4._tmp_running_mean", "layer5.1.conv.4._tmp_running_var", "layer5.1.conv.4._running_iter", "layer5.1.conv.7._tmp_running_mean", "layer5.1.conv.7._tmp_running_var", "layer5.1.conv.7._running_iter", "layer5.2.conv.1._tmp_running_mean", "layer5.2.conv.1._tmp_running_var", "layer5.2.conv.1._running_iter", "layer5.2.conv.4._tmp_running_mean", "layer5.2.conv.4._tmp_running_var", "layer5.2.conv.4._running_iter", "layer5.2.conv.7._tmp_running_mean", "layer5.2.conv.7._tmp_running_var", "layer5.2.conv.7._running_iter", "layer6.0.conv.1._tmp_running_mean", "layer6.0.conv.1._tmp_running_var", "layer6.0.conv.1._running_iter", "layer6.0.conv.4._tmp_running_mean", "layer6.0.conv.4._tmp_running_var", "layer6.0.conv.4._running_iter", "layer6.0.conv.7._tmp_running_mean", "layer6.0.conv.7._tmp_running_var", "layer6.0.conv.7._running_iter", "layer6.1.conv.1._tmp_running_mean", "layer6.1.conv.1._tmp_running_var", "layer6.1.conv.1._running_iter", "layer6.1.conv.4._tmp_running_mean", "layer6.1.conv.4._tmp_running_var", "layer6.1.conv.4._running_iter", "layer6.1.conv.7._tmp_running_mean", "layer6.1.conv.7._tmp_running_var", "layer6.1.conv.7._running_iter", "layer6.2.conv.1._tmp_running_mean", "layer6.2.conv.1._tmp_running_var", "layer6.2.conv.1._running_iter", "layer6.2.conv.4._tmp_running_mean", "layer6.2.conv.4._tmp_running_var", "layer6.2.conv.4._running_iter", "layer6.2.conv.7._tmp_running_mean", "layer6.2.conv.7._tmp_running_var", "layer6.2.conv.7._running_iter", "layer7.0.conv.1._tmp_running_mean", "layer7.0.conv.1._tmp_running_var", "layer7.0.conv.1._running_iter", "layer7.0.conv.4._tmp_running_mean", "layer7.0.conv.4._tmp_running_var", "layer7.0.conv.4._running_iter", "layer7.0.conv.7._tmp_running_mean", "layer7.0.conv.7._tmp_running_var", "layer7.0.conv.7._running_iter", "index0.indexnet1.1._tmp_running_mean", "index0.indexnet1.1._tmp_running_var", "index0.indexnet1.1._running_iter", "index0.indexnet2.1._tmp_running_mean", "index0.indexnet2.1._tmp_running_var", "index0.indexnet2.1._running_iter", "index0.indexnet3.1._tmp_running_mean", "index0.indexnet3.1._tmp_running_var", "index0.indexnet3.1._running_iter", "index0.indexnet4.1._tmp_running_mean", "index0.indexnet4.1._tmp_running_var", "index0.indexnet4.1._running_iter", "index2.indexnet1.1._tmp_running_mean", "index2.indexnet1.1._tmp_running_var", "index2.indexnet1.1._running_iter", "index2.indexnet2.1._tmp_running_mean", "index2.indexnet2.1._tmp_running_var", "index2.indexnet2.1._running_iter", "index2.indexnet3.1._tmp_running_mean", "index2.indexnet3.1._tmp_running_var", "index2.indexnet3.1._running_iter", "index2.indexnet4.1._tmp_running_mean", "index2.indexnet4.1._tmp_running_var", "index2.indexnet4.1._running_iter", "index3.indexnet1.1._tmp_running_mean", "index3.indexnet1.1._tmp_running_var", "index3.indexnet1.1._running_iter", "index3.indexnet2.1._tmp_running_mean", "index3.indexnet2.1._tmp_running_var", "index3.indexnet2.1._running_iter", "index3.indexnet3.1._tmp_running_mean", "index3.indexnet3.1._tmp_running_var", "index3.indexnet3.1._running_iter", "index3.indexnet4.1._tmp_running_mean", "index3.indexnet4.1._tmp_running_var", "index3.indexnet4.1._running_iter", "index4.indexnet1.1._tmp_running_mean", "index4.indexnet1.1._tmp_running_var", "index4.indexnet1.1._running_iter", "index4.indexnet2.1._tmp_running_mean", "index4.indexnet2.1._tmp_running_var", "index4.indexnet2.1._running_iter", "index4.indexnet3.1._tmp_running_mean", 
"index4.indexnet3.1._tmp_running_var", "index4.indexnet3.1._running_iter", "index4.indexnet4.1._tmp_running_mean", "index4.indexnet4.1._tmp_running_var", "index4.indexnet4.1._running_iter", "index6.indexnet1.1._tmp_running_mean", "index6.indexnet1.1._tmp_running_var", "index6.indexnet1.1._running_iter", "index6.indexnet2.1._tmp_running_mean", "index6.indexnet2.1._tmp_running_var", "index6.indexnet2.1._running_iter", "index6.indexnet3.1._tmp_running_mean", "index6.indexnet3.1._tmp_running_var", "index6.indexnet3.1._running_iter", "index6.indexnet4.1._tmp_running_mean", "index6.indexnet4.1._tmp_running_var", "index6.indexnet4.1._running_iter", "dconv_pp.aspp1.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp1.atrous_conv.1._tmp_running_var", "dconv_pp.aspp1.atrous_conv.1._running_iter", "dconv_pp.aspp2.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp2.atrous_conv.1._tmp_running_var", "dconv_pp.aspp2.atrous_conv.1._running_iter", "dconv_pp.aspp2.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp2.atrous_conv.4._tmp_running_var", "dconv_pp.aspp2.atrous_conv.4._running_iter", "dconv_pp.aspp3.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp3.atrous_conv.1._tmp_running_var", "dconv_pp.aspp3.atrous_conv.1._running_iter", "dconv_pp.aspp3.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp3.atrous_conv.4._tmp_running_var", "dconv_pp.aspp3.atrous_conv.4._running_iter", "dconv_pp.aspp4.atrous_conv.1._tmp_running_mean", "dconv_pp.aspp4.atrous_conv.1._tmp_running_var", "dconv_pp.aspp4.atrous_conv.1._running_iter", "dconv_pp.aspp4.atrous_conv.4._tmp_running_mean", "dconv_pp.aspp4.atrous_conv.4._tmp_running_var", "dconv_pp.aspp4.atrous_conv.4._running_iter", "dconv_pp.global_avg_pool.2._tmp_running_mean", "dconv_pp.global_avg_pool.2._tmp_running_var", "dconv_pp.global_avg_pool.2._running_iter", "dconv_pp.bottleneck_conv.1._tmp_running_mean", "dconv_pp.bottleneck_conv.1._tmp_running_var", "dconv_pp.bottleneck_conv.1._running_iter", "decoder_layer6.dconv.1._tmp_running_mean", "decoder_layer6.dconv.1._tmp_running_var", "decoder_layer6.dconv.1._running_iter", "decoder_layer5.dconv.1._tmp_running_mean", "decoder_layer5.dconv.1._tmp_running_var", "decoder_layer5.dconv.1._running_iter", "decoder_layer4.dconv.1._tmp_running_mean", "decoder_layer4.dconv.1._tmp_running_var", "decoder_layer4.dconv.1._running_iter", "decoder_layer3.dconv.1._tmp_running_mean", "decoder_layer3.dconv.1._tmp_running_var", "decoder_layer3.dconv.1._running_iter", "decoder_layer2.dconv.1._tmp_running_mean", "decoder_layer2.dconv.1._tmp_running_var", "decoder_layer2.dconv.1._running_iter", "decoder_layer1.dconv.1._tmp_running_mean", "decoder_layer1.dconv.1._tmp_running_var", "decoder_layer1.dconv.1._running_iter", "decoder_layer0.dconv.1._tmp_running_mean", "decoder_layer0.dconv.1._tmp_running_var", "decoder_layer0.dconv.1._running_iter", "pred.0.1._tmp_running_mean", "pred.0.1._tmp_running_var", "pred.0.1._running_iter".

Any idea on how to resolve this? Thanks
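
The missing keys are all extra buffers (_tmp_running_mean, _tmp_running_var, _running_iter) registered by the repository's custom BN layers; they hold running statistics rather than learned weights, so one common workaround is to load non-strictly and let those buffers keep their initial values (a sketch; verify that the remaining missing keys really are only these buffers):

    from collections import OrderedDict
    import torch

    checkpoint = torch.load(args.restore_from, map_location='cpu')
    pretrained_dict = OrderedDict()
    for key, value in checkpoint['state_dict'].items():
        if key.startswith('module.'):
            key = key[7:]                  # strip the DataParallel prefix
        pretrained_dict[key] = value

    # strict=False skips the missing statistics buffers instead of raising
    net.load_state_dict(pretrained_dict, strict=False)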

training from scratch

Thanks for the great work!
I am trying to train your model from scratch (only using the pretrained MobileNet weights).
I encountered two problems:

[Screenshot, 2019-11-27]

  • The first problem is shown by the red arrow: the alpha values in the unknown region are not large enough.
  • The second problem is shown by the blue arrow: outside the unknown region, there are always scattered white dots.

For the first problem, I think I have not trained for enough epochs (only to the 6th epoch so far).
For the second problem, I am very confused and have no idea.
Do you have any suggestions on these two problems?

Training on custom data set

  1. Does this work with custom binary masks as ground truth, instead of an alpha matte?
  2. How can I remove random cropping and train on the whole image instead of cropped patches? (See the sketch after this list.)
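
On the second question: the encoder works at output stride 32, so feeding whole images mostly requires padding each image so its sides are multiples of 32. A sketch of such a padding helper (illustrative, not from the repository):

    import torch.nn.functional as F

    def pad_to_stride(x, stride=32):
        # Zero-pad an NCHW tensor so H and W are multiples of `stride`,
        # allowing whole-image training without random cropping.
        h, w = x.shape[-2:]
        pad_h = (stride - h % stride) % stride
        pad_w = (stride - w % stride) % stride
        return F.pad(x, (0, pad_w, 0, pad_h))   # (left, right, top, bottom)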

How to generate Trimap?

Hi, thanks for the implementation; the results are really good and indeed look similar to the paper's.

Any suggestions on how to generate trimap masks?
Also, is there a plan to release the training code?
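
On trimap generation, a common baseline is to erode and dilate a binarized alpha (or segmentation mask) and mark the band in between as unknown; a sketch using OpenCV (the kernel size is illustrative):

    import cv2
    import numpy as np

    def make_trimap(alpha, kernel_size=10):
        # Unknown band (128) around the alpha boundary via erosion/dilation.
        kernel = np.ones((kernel_size, kernel_size), np.uint8)
        dilated = cv2.dilate((alpha > 0).astype(np.uint8), kernel)
        eroded = cv2.erode((alpha == 255).astype(np.uint8), kernel)
        trimap = np.full(alpha.shape, 128, np.uint8)   # unknown by default
        trimap[dilated == 0] = 0                       # well outside: background
        trimap[eroded == 1] = 255                      # well inside: foreground
        return trimap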

Train Own Dataset

Hello,

I want to train my own dataset to generate indexnet_matting.pth.tar.

What do I have to do? I see that trimaps are needed first. Do you have a complete matting project that generates trimaps and then runs training?

Or how do I first generate these trimaps?
Thanks

evaluation code

When will the Python version of your evaluation code be made public?
