miguelvr / dropblock
Implementation of DropBlock: A regularization method for convolutional networks in PyTorch.
License: MIT License
Hi,
did you implement the scheduled DropBlock mentioned in the paper?
"In our experiments, we use a linear scheme of decreasing the value of keep_prob, which tends to work well across many hyperparameter settings. This linear scheme is similar to ScheduledDropPath."
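For reference, the package exposes a LinearScheduler wrapper (it also appears in the traced-module example further down this page); a minimal sketch of wiring it up, with illustrative values for block_size and nr_steps:

import torch
from dropblock import DropBlock2D, LinearScheduler

# drop_prob ramps linearly from 0.0 to 0.25 over the first 5000 steps
drop_block = LinearScheduler(
    DropBlock2D(block_size=5, drop_prob=0.),
    start_value=0.,
    stop_value=0.25,
    nr_steps=5000,
)

for batch in range(3):  # stand-in training loop
    drop_block.step()   # advance the schedule once per iteration
    out = drop_block(torch.randn(4, 16, 32, 32))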
Do you have a pre-trained model released?
Or is it OK to just change the dropout to DropBlock in a pre-trained ResNet-50?
In this part, drop_prob in DropBlock2D(block_size=3, drop_prob=0.) is set to 0. and stop_value is set to 0.25. But in the example in resnet-cifar10.py (https://github.com/miguelvr/dropblock/tree/master/examples), drop_prob in DropBlock2D is set as follows:
The parameters drop_prob and stop_value both take the identical setting "drop_prob", whereas in the first example drop_prob and stop_value are 0. and 0.25 respectively. Which one is right? Thank you very much.
Description:
When the Bernoulli distribution samples two ones within their own block size range, the block mask gets a negative value.
Example:
import torch
from dropblock import DropBlock2D

db = DropBlock2D(block_size=2, drop_prob=0.1)
mask = torch.tensor([[[1., 0., 0., 0., 0.],
                      [0., 1., 0., 0., 0.],
                      [0., 0., 0., 0., 0.],
                      [0., 0., 0., 0., 0.],
                      [0., 0., 0., 0., 0.]]])
block_mask = db._compute_block_mask(mask)
block_mask
value:
tensor([[[ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  0.,  0.,  1.,  1.,  1.],
         [ 1.,  0., -1.,  0.,  1.,  1.],
         [ 1.,  1.,  0.,  0.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.]]])
expected result:
tensor([[[ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  0.,  0.,  1.,  1.,  1.],
         [ 1.,  0.,  0.,  0.,  1.,  1.],
         [ 1.,  1.,  0.,  0.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.]]])
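One possible fix (a sketch, not necessarily how it was resolved upstream) is to clamp the summed seed mask before inverting it, so overlapping seeds cannot drive the block mask negative:

import torch
import torch.nn.functional as F

def compute_block_mask(mask, block_size):
    # mask: (N, H, W) Bernoulli seeds; expand each seed into a
    # block_size x block_size block with a box-filter convolution
    summed = F.conv2d(mask[:, None, :, :],
                      torch.ones((1, 1, block_size, block_size)),
                      padding=block_size // 2)
    # overlapping seeds make summed > 1; clamp before inverting so the
    # mask stays in {0, 1} instead of going to -1 as reported above
    # (for even block_size the output is one pixel larger, as above)
    return (1 - summed.clamp(max=1)).squeeze(1)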
Calling model.eval() does not shut off DropBlock. How can I put it in evaluation mode for testing?
Thanks
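Since DropBlock2D is an nn.Module, the usual PyTorch pattern applies: forward gates on self.training, so model.eval() disables the layer as long as it is registered as a submodule that eval() recurses into. A sketch of that pattern (a generic module, not this repo's exact code):

import torch
import torch.nn as nn

class DropBlockLike(nn.Module):
    def __init__(self, drop_prob):
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        # no-op in eval mode or when drop_prob is zero, like nn.Dropout
        if not self.training or self.drop_prob == 0.:
            return x
        # ... sample and apply the block mask here ...
        return x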
In the benchmark there is
- scheduled dropblock with block_size=5 and increasing drop_prob from 0.0 to 0.25 over 5000 iterations
At https://github.com/miguelvr/dropblock/blob/master/examples/resnet-cifar10.py#L30, should it be stop_value=0.25?
Because currently in the code start_value=0 and stop_value=drop_prob, where drop_prob equals 0.0 (according to config.yml). So the DropBlock probability will be zero all the time, right?
dropblock/dropblock/dropblock.py, line 46 in 0ecbb63
Modifying this line to `mask = (torch.rand(x.shape[0], *x.shape[2:]).to(x.device) < gamma).float()` can make DropBlock run faster.
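A self-contained sketch of the suggested change (gamma and the tensor shapes are illustrative); thresholding a uniform sample against gamma draws from the same Bernoulli distribution:

import torch

x = torch.randn(16, 64, 32, 32)
gamma = 0.02  # illustrative seed probability

# one way to draw the seeds: explicit Bernoulli sampling
mask_bernoulli = torch.bernoulli(
    torch.full((x.shape[0], *x.shape[2:]), gamma, device=x.device))

# suggested: threshold a uniform sample instead; same distribution
mask_rand = (torch.rand(x.shape[0], *x.shape[2:], device=x.device) < gamma).float()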
dropblock/dropblock/dropblock.py, line 78 in 04b759f
Is the block mask computed with a convolution in order to handle overlapping blocks? And the output of _compute_block_mask must have the same size as x, right?
But when I try block_size=7, the output of _compute_block_mask is [N, 1, 1], and when I try block_size=9 I get an error.
In _compute_block_mask, after the conv2d the height and width are mask_size + 2*(block_size//2 + 1) - block_size + 1, while the height and width of the input x are mask_size + block_size//2. The former must be larger than the latter, so block_size//2 cannot be higher than 3?
Or does it turn off by default?
To ensure the output has the same scale during training and testing, should the output be scaled by 1/(1-p) during training?
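A runnable sketch of the usual renormalization (the mask and shapes here are illustrative); rescaling by the kept fraction is the mask-aware analogue of the classic 1/(1-p) dropout factor:

import torch

x = torch.randn(4, 8, 16, 16)
block_mask = (torch.rand(4, 16, 16) > 0.1).float()  # illustrative block mask

out = x * block_mask[:, None, :, :]
# rescale by the kept fraction so the train-time expectation matches
# eval time; with an elementwise i.i.d. mask this reduces to 1 / (1 - p)
out = out * block_mask.numel() / block_mask.sum()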
The paper says, "We found that applying DropBlock in skip connections in addition to the convolution layers increases the accuracy." But in the example file resnet-cifar10.py provided in this repo, DropBlock is plugged in at different places. Thank you so much for your help.
File "/opt/conda/lib/python3.6/site-packages/dropblock/dropblock.py", line 140, in forward
out = x * block_mask[:, None, :, :, :]
Encountered when using DropBlock3D.
Hi, how could I modify the code so that each feature channel gets its own independent DropBlock mask?
Thanks a lot!
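One way to do it (a sketch, assuming the stock implementation samples one mask per image and broadcasts it over channels) is to sample the seeds over (N, C, H, W) instead of (N, H, W):

import torch
import torch.nn.functional as F

def per_channel_block_mask(x, gamma, block_size):
    # seeds over (N, C, H, W): every channel gets an independent mask
    seeds = (torch.rand_like(x) < gamma).float()
    # expand seeds into blocks; odd block_size keeps the spatial size
    block = F.max_pool2d(seeds, kernel_size=block_size,
                         stride=1, padding=block_size // 2)
    return 1 - block

x = torch.randn(2, 3, 16, 16)
mask = per_channel_block_mask(x, gamma=0.05, block_size=3)
out = x * mask * mask.numel() / mask.sum()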
Description:
A maxpool operation can be used for the block mask calculation and might be more efficient to compute than a convolution.
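A minimal sketch of that idea: any seed inside the pooling window marks the whole window, so the block expansion needs no handling of overlaps (values below are illustrative):

import torch
import torch.nn.functional as F

seeds = (torch.rand(4, 1, 32, 32) < 0.02).float()  # Bernoulli seeds
# max pooling turns each seed into a block_size x block_size block
block_mask = 1 - F.max_pool2d(seeds, kernel_size=5, stride=1, padding=2)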
Hi! Check out PyTorch Lightning as an option for your backend! We're looking for awesome projects implemented in Lightning.
Your project will be really easy to maintain on Lightning!
Can Dropblock be used for image segmentation tasks?
dropblock/dropblock/dropblock.py, line 50 in 0a1f2ab
Hello, there are different padding strategies in DropBlock2D and DropBlock3D; is something wrong? One uses `padding=int(np.ceil(self.block_size / 2) + 1)` and the other `padding=int(np.ceil(self.block_size // 2) + 1)`.
When I implemented my own version of DropBlock, I found that Bernoulli sampling can be extremely slow. Therefore I recommend replacing the Bernoulli draw with a comparison against a uniform random matrix.
An implementation might look like this:
mask = (torch.rand(x.shape[0], *mask_sizes) < gamma).float()
In the cifar10 example, the drop_prob in config.yaml is 0. instead of 0.25; is that right?
When you say to only add it in the convolution feature extraction layer only. I just want to make sure I understand correctly what you meant. Did you mean (if I take the U-Net as an example):
Thank you very much,
Originally posted by @Eric2Hamel in #18 (comment)
import torch
from dropblock import DropBlock2D

dropout = DropBlock2D(0.2, block_size=3)
input = torch.randn((1, 1, 8, 8))
output = dropout(input)
49 block_mask = self._compute_block_mask(mask)
50 # apply block mask
---> 51 out = x * block_mask[:, None, :, :]
52
53 # scale output
RuntimeError: The size of tensor a (8) must match the size of tensor b (9) at non-singleton dimension 3
Hello, in the paper they say that they also apply DropBlock to residual connections. I am searching for an example of this so as not to make a mistake. Can you please give an example of how exactly that should be done?
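A sketch of one way to read the paper's suggestion in a ResNet-style block (a hypothetical module, not taken from this repo's example file):

import torch
import torch.nn as nn
from dropblock import DropBlock2D

class BasicBlockWithDropBlock(nn.Module):
    # DropBlock on the residual branch and on the skip connection,
    # per the paper's remark; hyperparameters are illustrative
    def __init__(self, channels, drop_prob=0.1, block_size=3):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.dropblock = DropBlock2D(drop_prob=drop_prob, block_size=block_size)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.dropblock(self.bn2(self.conv2(out)))
        identity = self.dropblock(x)  # DropBlock on the skip connection too
        return self.relu(out + identity)

block = BasicBlockWithDropBlock(16)
y = block(torch.randn(2, 16, 32, 32))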
Awesome work!!! Have you tested the speed of your implementation of DropBlock?
Thank you for your excellent work on DropBlock. However, I get some errors when I run the code. They occur at line 34 of resnet-cifar10.py, self.layer1 = self._make_layer(block, 64, layers[0]). I am puzzled about it.
It seems that there is no version for Python 3.8.
As the title says.
Can someone help me answer it?
Thanks!
Thanks for the code!
I was wondering if you get the same results as traditional dropout when block_size=1.
Based on my experiments, using F.dropout2d for traditional dropout, I cannot confirm this.
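One way to probe this empirically is to compare the fraction of zeroed activations under each scheme (a diagnostic sketch assuming the current max-pool-based implementation; note that F.dropout2d zeroes whole channels, a different scheme from elementwise dropout, while this repo's DropBlock appears to share one spatial mask across channels):

import torch
import torch.nn.functional as F
from dropblock import DropBlock2D

x = torch.ones(8, 16, 64, 64)
p = 0.2

db = DropBlock2D(drop_prob=p, block_size=1)
db.train()
frac_db = (db(x) == 0).float().mean().item()
frac_do = (F.dropout(x, p=p, training=True) == 0).float().mean().item()

print(f"DropBlock(block_size=1) zero fraction: {frac_db:.3f}")
print(f"F.dropout zero fraction:               {frac_do:.3f}")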
All network layers in this repo can be traced with torch.jit.trace(), but their control flow won't work correctly in traced modules (e.g. training/eval mode is not respected, nor is the check in dropblock/dropblock/scheduler.py, line 16 in 16a518a).
Since tracing this code does not necessarily emit warnings (see the example below), I think this incompatibility should be documented here to make sure no one mistakenly trains traced networks. On the other hand, networks that are traced in eval mode after a complete training run should work as intended, as long as they are never put back in training mode.
Example code that produces a wrong TracedModule without any warning (using PyTorch 1.0.0):
import torch
from dropblock import DropBlock2D, LinearScheduler
drop_block = LinearScheduler(
DropBlock2D(block_size=3, drop_prob=0.),
start_value=0.,
stop_value=0.25,
nr_steps=5
)
x = torch.randn(1, 1, 8, 8)
traced = torch.jit.trace(drop_block, x)
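Per the note above, tracing after training with the module in eval mode should bake the no-op eval path into the graph (a sketch continuing the snippet above):

drop_block.eval()                        # freeze the identity eval path
traced = torch.jit.trace(drop_block, x)  # safe as long as the traced module
                                         # is never put back in training mode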
Hello, thanks for your nice code!
I found there were 2 inconsistencies with the original paper, and they are very easy to fix indeed:
1. gamma: its computation differs from the formula given in the paper.
2. In the original paper, all the block_masks are complete squares (or cubes), since the seed masks are only sampled on the central part of the feature map; the paper also samples independent masks per channel, while your implementation uses the same one.
I just figured them out; actually I do not know whether these are effective tricks, as there are insufficient details discussed in the paper :)
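For reference, the paper computes the seed probability as

gamma = (1 - keep_prob) / block_size^2 * feat_size^2 / (feat_size - block_size + 1)^2

where the second factor accounts for seeds only being sampled in the central region where a full block fits.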
python 3.6
pytorch 0.4.1
Collecting dropblock
Downloading https://files.pythonhosted.org/packages/42/e9/ea1afa72c7114685e6e971e23d68151eea00de171c2c7a6b9872c600be33/dropblock-0.1.0-py3-none-any.whl
Requirement already satisfied: numpy in /data/guoxiaobao/Anaconda3/envs/pytorch/lib/python3.6/site-packages (from dropblock)
Collecting torch==0.4.1 (from dropblock)
Downloading https://files.pythonhosted.org/packages/49/0e/e382bcf1a6ae8225f50b99cc26effa2d4cc6d66975ccf3fa9590efcbedce/torch-0.4.1-cp36-cp36m-manylinux1_x86_64.whl (519.5MB)
100% |████████████████████████████████| 519.5MB 2.7kB/s
Installing collected packages: torch, dropblock
Found existing installation: torch 0.4.0
Uninstalling torch-0.4.0:
Successfully uninstalled torch-0.4.0
Successfully installed dropblock-0.1.0 torch-0.4.1
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dropblock import DropBlock2D
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'DropBlock2D'
how can I fix this problem?
Here, shouldn't feat_size be sqrt(width * height)?
Can you please also include DropBlock1D implementation to use it for time-series? Thank you very much.
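A minimal 1D sketch along the lines of the 2D version (a hypothetical class, not part of this package; it assumes an odd block_size and an (N, C, L) input):

import torch
import torch.nn as nn
import torch.nn.functional as F

class DropBlock1D(nn.Module):
    def __init__(self, drop_prob, block_size):
        super().__init__()
        self.drop_prob = drop_prob
        self.block_size = block_size

    def forward(self, x):
        if not self.training or self.drop_prob == 0.:
            return x
        gamma = self.drop_prob / self.block_size
        # seeds shared across channels, mirroring the 2D implementation
        seeds = (torch.rand(x.shape[0], 1, x.shape[2],
                            device=x.device) < gamma).float()
        # odd block_size keeps the sequence length unchanged
        block_mask = 1 - F.max_pool1d(seeds, kernel_size=self.block_size,
                                      stride=1, padding=self.block_size // 2)
        out = x * block_mask
        return out * block_mask.numel() / block_mask.sum()

db = DropBlock1D(drop_prob=0.1, block_size=5)
db.train()
y = db(torch.randn(4, 8, 128))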
Hi,
Thanks for your work on this!!!
I am reading your code and found this line:
dropblock/dropblock/dropblock.py, line 63 in 7fb8fbf
It seems that you generate a random mask and then apply a max pooling operation to it. However, after testing the max pooling operation, I found that the number of 1s in the mask is not preserved by this operation:
import torch
import torch.nn.functional as F

mask1 = torch.randint(0, 2, (1, 1, 256, 256)).float()
mask2 = F.max_pool2d(mask1, kernel_size=(5, 5), stride=1, padding=2)
print(mask1.sum())
print(mask2.sum())
The results are:
32715
65536
It seems that the proportion of positive mask entries is changed by this operation. Could you please tell me how you make sure the masked area stays the same after max pooling?