
dropblock's People

Contributors

hanochk


dropblock's Issues

scheduled dropblock?

Hi,

Did you implement scheduled DropBlock as mentioned in the paper?

In our experiments, we use a linear scheme of decreasing the value of keep_prob, which tends to work well across many hyperparameter settings. This linear scheme is similar to ScheduledDropPath.
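
For reference, the LinearScheduler shipped with this repo appears to implement exactly this linear ramp; a minimal usage sketch, mirroring the torch.jit example later in this thread (that step() advances the schedule is my reading of the scheduler, so treat it as an assumption):

    from dropblock import DropBlock2D, LinearScheduler

    # Ramp drop_prob linearly from 0.0 to 0.25 over 5 steps, analogous to
    # the paper's linear decrease of keep_prob.
    drop_block = LinearScheduler(
        DropBlock2D(block_size=3, drop_prob=0.),
        start_value=0.,
        stop_value=0.25,
        nr_steps=5
    )

    for epoch in range(5):
        drop_block.step()  # advance the schedule, e.g. once per epoch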

The examples for using DropBlock are different?

[screenshot of the README example]

In this part, drop_prob in DropBlock2D(block_size=3, drop_prob=0.) is set to "0." and stop_value is set to "0.25". But in the resnet-cifar10.py example (https://github.com/miguelvr/dropblock/tree/master/examples), drop_prob in DropBlock2D is set as follows:

[screenshot of the resnet-cifar10.py example]

Here the parameters drop_prob and stop_value both take the identical setting via "drop_prob", whereas in the first example drop_prob and stop_value are 0. and 0.25 respectively. Which one is right? Thank you very much.
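
For concreteness, the two configurations side by side (a sketch; my reading of resnet-cifar10.py is an assumption). Since start_value is 0. in both cases, the scheduler overwrites the initial drop_prob on the first step, so the two should behave the same once training starts:

    from dropblock import DropBlock2D, LinearScheduler

    # README-style: initial drop_prob of 0., ramped up to a literal 0.25.
    readme_style = LinearScheduler(
        DropBlock2D(block_size=3, drop_prob=0.),
        start_value=0.,
        stop_value=0.25,
        nr_steps=5
    )

    # resnet-cifar10.py-style: the target probability is passed both as the
    # initial drop_prob and as stop_value.
    drop_prob = 0.25
    resnet_style = LinearScheduler(
        DropBlock2D(block_size=3, drop_prob=drop_prob),
        start_value=0.,
        stop_value=drop_prob,
        nr_steps=5
    )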

Negative block mask edge case

Description:

When the Bernoulli distribution samples two ones within each other's block-size range, their expanded blocks overlap and the block mask gets a negative value.

Example:

    import torch
    from dropblock import DropBlock2D

    db = DropBlock2D(block_size=2, drop_prob=0.1)
    # Two sampled centres at (0, 0) and (1, 1) fall within each other's
    # 2x2 block, so the expanded blocks overlap:
    mask = torch.tensor([[[1., 0., 0., 0., 0.],
                          [0., 1., 0., 0., 0.],
                          [0., 0., 0., 0., 0.],
                          [0., 0., 0., 0., 0.],
                          [0., 0., 0., 0., 0.]]])

    block_mask = db._compute_block_mask(mask)

Actual block_mask value:

tensor([[[ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  0.,  0.,  1.,  1.,  1.],
         [ 1.,  0., -1.,  0.,  1.,  1.],
         [ 1.,  1.,  0.,  0.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.]]])

Expected result:

tensor([[[ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  0.,  0.,  1.,  1.,  1.],
         [ 1.,  0.,  0.,  0.,  1.,  1.],
         [ 1.,  1.,  0.,  0.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.],
         [ 1.,  1.,  1.,  1.,  1.,  1.]]])
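
One way to avoid the overlap artifact is to expand the sampled centres with max pooling instead of summing them with a convolution, since pooling keeps the mask binary no matter how blocks overlap (a sketch, not necessarily the fix adopted by the repo):

    import torch
    import torch.nn.functional as F

    def compute_block_mask(mask, block_size):
        # Expand every sampled centre into a block_size x block_size block.
        # Overlapping blocks simply stay 1, so no negative values appear.
        block_mask = F.max_pool2d(mask[:, None, :, :],
                                  kernel_size=(block_size, block_size),
                                  stride=(1, 1),
                                  padding=block_size // 2)
        # For even block sizes the pooled output is one cell larger than the
        # input and would need cropping back to mask's spatial size.
        return 1 - block_mask.squeeze(1)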

the drop_size cannot be higher than 5?

padding=int(np.ceil(self.block_size // 2) + 1))

The implementation computes the block mask with conv2d to handle overlapping blocks, and the output of _compute_block_mask must have the same size as x, right?

But when I try block_size=7, the output of _compute_block_mask is [N, 1, 1], and when I try block_size=9 I get an error.

In _compute_block_mask, after the conv2d the height and width are
mask_size + 2*(block_size//2 + 1) - block_size + 1,
while the height and width of the input x are mask_size + block_size//2.
The former must be at least the latter, so block_size//2 cannot be higher than 3?
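
The conv output size is easy to sanity-check directly (a sketch; the sizes here are hypothetical):

    import torch
    import torch.nn.functional as F

    mask_size, block_size = 8, 7
    padding = block_size // 2 + 1  # the padding used in the repo
    mask = torch.rand(1, 1, mask_size, mask_size)
    weight = torch.ones(1, 1, block_size, block_size)
    out = F.conv2d(mask, weight, padding=padding)
    # Output spatial size = mask_size + 2*padding - block_size + 1
    print(out.shape)  # torch.Size([1, 1, 10, 10])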

Where to put the dropblock in the ResNet network?

The description in the paper says "We found that applying DropBlock in skip connections in addition to the convolution layers increases the accuracy," but in the example file resnet-cifar10.py provided in this repo, the places for plugging in DropBlock are different. Thank you so much for your help.
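
For concreteness, a sketch of one placement (my reading of resnet-cifar10.py; the exact layer groups are an assumption, and note the paper itself applies DropBlock to the later groups and to the skip connections):

    # Inside a ResNet-style model whose __init__ defines self.dropblock:
    def forward(self, x):
        x = self.relu(self.bn1(self.conv1(x)))
        x = self.dropblock(self.layer1(x))  # DropBlock on group 1 output
        x = self.dropblock(self.layer2(x))  # DropBlock on group 2 output
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        return self.fc(x.view(x.size(0), -1))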

Segmentation?

Can DropBlock be used for image segmentation tasks?

About padding

Hello, there are different padding strategies in DropBlock2D and DropBlock3D; is something wrong? One uses "padding=int(np.ceil(self.block_size / 2) + 1))" while the other uses "padding=int(np.ceil(self.block_size // 2) + 1))".
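
For odd block sizes the two expressions disagree, e.g.:

    import numpy as np

    block_size = 5
    print(int(np.ceil(block_size / 2) + 1))   # 4 (ceil(2.5) = 3, plus 1)
    print(int(np.ceil(block_size // 2) + 1))  # 3 (5 // 2 = 2, plus 1)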

Replace Bernoulli with creating random matrix

When I implemented my own version of DropBlock, I found that Bernoulli sampling can be extremely slow. Therefore I recommend replacing Bernoulli with thresholding a random matrix. An implementation might look like this:

mask = (torch.rand(x.shape[0], *mask_sizes) < gamma).float()
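
For comparison, the two sampling styles side by side (a sketch; both draw a Bernoulli(gamma) mask, the replacement just avoids constructing a distribution object on every forward pass):

    import torch
    from torch.distributions import Bernoulli

    gamma, mask_sizes = 0.1, (8, 8)
    x = torch.randn(16, 3, 8, 8)

    # Distribution-based sampling:
    mask_a = Bernoulli(gamma).sample((x.shape[0], *mask_sizes))

    # Suggested replacement: threshold a uniform random matrix.
    mask_b = (torch.rand(x.shape[0], *mask_sizes) < gamma).float()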

Use in segmentation: you suggest the convolutional feature-extraction layers only. What do you mean?

When you say to add it only in the convolutional feature-extraction layers, I just want to make sure I understand correctly what you meant. Did you mean (taking U-Net as an example):

  1. A single layer at the end of the encoder (the one that makes the hidden feature representation).
  2. All the layers of the encoder of the segmentation model (responsible for extracting features from the image); see the sketch after this issue.
  3. The layers of both the encoder and the decoder of the segmentation model (all hidden layers, except the last one responsible for the prediction).

Thank you very much,

Originally posted by @Eric2Hamel in #18 (comment)
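
For illustration, option 2 might look like this (a hypothetical U-Net-style encoder block; this is only one reading of the advice, not the author's confirmed answer):

    import torch.nn as nn
    from dropblock import DropBlock2D

    class EncoderBlock(nn.Module):
        # Option 2: DropBlock after every encoder conv block.
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, padding=1),
                nn.ReLU(inplace=True),
            )
            self.dropblock = DropBlock2D(block_size=3, drop_prob=0.1)

        def forward(self, x):
            return self.dropblock(self.conv(x))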

there is something wrong

import torch
from dropblock import DropBlock2D

dropout = DropBlock2D(0.2, block_size=3)
input = torch.randn((1, 1, 8, 8))
output = dropout(input)

49 block_mask = self._compute_block_mask(mask)
50 # apply block mask
---> 51 out = x * block_mask[:, None, :, :]
52
53 # scale output

RuntimeError: The size of tensor a (8) must match the size of tensor b (9) at non-singleton dimension 3

DropBlock at Residual Connections

Hello, in the paper they say that they also apply DropBlock to residual connections. I am searching for an example of this so as not to make a mistake. Can you please give an example of how exactly that should be done?
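
A sketch of one way to read that sentence: apply DropBlock on both the main branch and the skip connection before the addition (this is an interpretation of the paper, not a confirmed example from the authors):

    import torch.nn as nn
    from dropblock import DropBlock2D

    class BasicBlockWithDropBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)
            self.dropblock = DropBlock2D(block_size=3, drop_prob=0.1)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.dropblock(self.bn2(self.conv2(out)))
            identity = self.dropblock(x)  # DropBlock on the skip connection
            return self.relu(out + identity)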

'ResNetCustom' object has no attribute '_norm_layer'

Thank you for your excellent work on DropBlock. However, there are some errors when I run the code. It occurred at line 34 of resnet-cifar10.py, self.layer1 = self._make_layer(block, 64, layers[0]). I am puzzled about it.
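
One plausible cause (an assumption, not a confirmed diagnosis): newer torchvision versions set self._norm_layer inside ResNet.__init__ and read it in _make_layer, so an __init__ hand-copied from an older torchvision will hit this error. A defensive sketch:

    import torch.nn as nn
    from torchvision.models.resnet import ResNet, BasicBlock

    class ResNetCustom(ResNet):
        def __init__(self, num_classes=10):
            super().__init__(BasicBlock, [2, 2, 2, 2], num_classes=num_classes)
            # If layers are rebuilt in a custom __init__, make sure the
            # attribute that _make_layer expects actually exists first:
            if not hasattr(self, '_norm_layer'):
                self._norm_layer = nn.BatchNorm2d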

torch.jit support

All network layers in this repo can be traced with torch.jit.trace(), but their control flow won't work correctly in traced modules (e.g. training/eval mode is not respected, and the check in

if self.i < len(self.drop_values):

will be frozen to whichever branch was taken during tracing), unlike PyTorch's native Dropout implementations. I've tried to fix this by rewriting the control flow explicitly in TorchScript, but I haven't gotten it to work yet.

Since tracing this code does not necessarily emit warnings (see the example below), I think this incompatibility should be documented here to make sure no one mistakenly trains traced networks. On the other hand, networks that are traced in eval mode after training completes should work as intended, as long as they are not put back in training mode.

Example code that produces a wrong TracedModule without any warning (using PyTorch 1.0.0):

import torch
from dropblock import DropBlock2D, LinearScheduler

drop_block = LinearScheduler(
    DropBlock2D(block_size=3, drop_prob=0.),
    start_value=0.,
    stop_value=0.25,
    nr_steps=5
)

x = torch.randn(1, 1, 8, 8)
traced = torch.jit.trace(drop_block, x)
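
Per the note above, the safer workflow is to trace only after training and only in eval mode (a short sketch continuing the example):

    # Train normally first, so the Python-level control flow runs as
    # intended, then freeze the behaviour by tracing in eval mode:
    drop_block.eval()
    traced = torch.jit.trace(drop_block, x)
    # `traced` now permanently behaves like eval mode; putting it back in
    # training mode will not re-enable DropBlock.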

inconsistency with the original paper

Hello, thanks for your nice code!

I found two inconsistencies with the original paper, and they are very easy to fix:

  1. the gamma: in the original paper, all block masks are complete squares (or cubes), since the mask centres are only sampled in the central part of the feature map.
  2. the paper says the channels use different masks, while in your implementation they share the same one.

I just figured these out; actually, I do not know whether they are effective tricks, since there are insufficient details discussed in the paper :)
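
Both points can be sketched as follows (illustrative names; this assumes the paper's formulation rather than the repo's):

    import torch
    import torch.nn.functional as F

    def paper_style_block_mask(x, block_size, gamma):
        n, c, h, w = x.shape
        # Point 1: sample centres only in the valid central region, so every
        # dropped block is a complete block_size x block_size square.
        valid_h, valid_w = h - block_size + 1, w - block_size + 1
        centres = (torch.rand(n, c, valid_h, valid_w) < gamma).float()
        centres = F.pad(centres, [block_size // 2] * 4)  # back to (h, w) for odd block_size
        # Point 2: the mask keeps its channel dimension, so each channel is
        # dropped independently (a shared mask would have shape (n, 1, h, w)).
        block_mask = F.max_pool2d(centres, kernel_size=block_size,
                                  stride=1, padding=block_size // 2)
        return 1 - block_mask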

ImportError: cannot import name 'DropBlock2D'

python 3.6
pytorch 0.4.1
Collecting dropblock
Downloading https://files.pythonhosted.org/packages/42/e9/ea1afa72c7114685e6e971e23d68151eea00de171c2c7a6b9872c600be33/dropblock-0.1.0-py3-none-any.whl
Requirement already satisfied: numpy in /data/guoxiaobao/Anaconda3/envs/pytorch/lib/python3.6/site-packages (from dropblock)
Collecting torch==0.4.1 (from dropblock)
Downloading https://files.pythonhosted.org/packages/49/0e/e382bcf1a6ae8225f50b99cc26effa2d4cc6d66975ccf3fa9590efcbedce/torch-0.4.1-cp36-cp36m-manylinux1_x86_64.whl (519.5MB)
100% |████████████████████████████████| 519.5MB 2.7kB/s
Installing collected packages: torch, dropblock
Found existing installation: torch 0.4.0
Uninstalling torch-0.4.0:
Successfully uninstalled torch-0.4.0
Successfully installed dropblock-0.1.0 torch-0.4.1

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> from dropblock import DropBlock2D
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'DropBlock2D'

How can I fix this problem?

DropBlock1D

Can you please also include a DropBlock1D implementation to use for time series? Thank you very much.
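
In the meantime, a minimal DropBlock1D sketch mirroring the 2D version (not part of this repo; the gamma formula is simplified, and the whole thing is an illustration only):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DropBlock1D(nn.Module):
        # Drops contiguous spans along the last (time) dimension of an
        # (N, C, L) input, sharing one mask across channels.
        def __init__(self, drop_prob, block_size):
            super().__init__()
            self.drop_prob = drop_prob
            self.block_size = block_size

        def forward(self, x):
            if not self.training or self.drop_prob == 0.:
                return x
            # Simplified gamma; the paper's formula also accounts for the
            # valid sampling region.
            gamma = self.drop_prob / self.block_size
            mask = (torch.rand(x.shape[0], 1, x.shape[2],
                               device=x.device) < gamma).float()
            block_mask = 1 - F.max_pool1d(mask, kernel_size=self.block_size,
                                          stride=1,
                                          padding=self.block_size // 2)
            if block_mask.shape[-1] != x.shape[-1]:  # even block sizes overshoot
                block_mask = block_mask[..., :x.shape[-1]]
            out = x * block_mask
            # Rescale like inverted dropout so magnitudes are preserved.
            return out * block_mask.numel() / block_mask.sum()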

Would you please tell me how you made sure the positive area of the mask stays the same after the max pooling operation?

Hi,

Thanks for your work on this !!!

I am reading your code and found this line:

block_mask = F.max_pool2d(input=mask[:, None, :, :],

It seems that you generate a random mask and then apply a max pooling operation to it. However, after testing the max pooling operation, I found that the number of 1s in the mask is not preserved:

import torch
import torch.nn.functional as F

mask1 = torch.randint(0, 2, (1, 1, 256, 256)).float()
mask2 = F.max_pool2d(mask1, kernel_size=(5, 5), stride=1, padding=2)
print(mask1.sum())
print(mask2.sum())

The result is:

32715
65536

It seems that the proportion of positive entries changes with this operation. Would you please tell me how you make sure the masked area stays consistent after max pooling?
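
The count of ones isn't meant to be preserved: max pooling deliberately grows each sampled centre into a full block, and the layer then compensates by rescaling the output by the fraction of entries kept. A sketch of that compensation (mirroring the "# scale output" step visible in the traceback earlier in this thread):

    import torch
    import torch.nn.functional as F

    mask = (torch.rand(1, 1, 256, 256) < 0.02).float()  # sampled centres
    block_mask = 1 - F.max_pool2d(mask, kernel_size=5, stride=1, padding=2)

    x = torch.randn(1, 64, 256, 256)
    out = x * block_mask
    # Rescale so the expected magnitude matches the un-dropped input,
    # exactly like inverted dropout:
    out = out * block_mask.numel() / block_mask.sum()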
