The alignseg from speedinghzl

A possible problem of the sampling operator in CAB and CAM

Dear author,
Thanks for your excellent work! While trying to follow the AlignFA module, I found that the bilinear_interpolate_torch_gridsample function does not sample the features as I expected. In short, I think the normalization factors for the vertical and horizontal coordinates are swapped.
I ran a toy experiment to show the possible problem in the function.
Take a tensor of shape (1, 1, 4, 8) like:

tensor([[[[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11., 12., 13., 14., 15.],
          [16., 17., 18., 19., 20., 21., 22., 23.],
          [24., 25., 26., 27., 28., 29., 30., 31.]]]])

When I want every output pixel to take the value of the pixel directly below it, I 'predict' a delta tensor of shape (1, 2, 4, 8), with a zero horizontal offset in channel 0 and a one-pixel vertical offset in channel 1:

tensor([[[[0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0]],

         [[1, 1, 1, 1, 1, 1, 1, 1],
          [1, 1, 1, 1, 1, 1, 1, 1],
          [1, 1, 1, 1, 1, 1, 1, 1],
          [1, 1, 1, 1, 1, 1, 1, 1]]]])

And the expected output is:

tensor([[[[ 8.0000,  9.0000, 10.0000, 11.0000, 12.0000, 13.0000, 14.0000, 15.0000],
          [16.0000, 17.0000, 18.0000, 19.0000, 20.0000, 21.0000, 22.0000, 23.0000],
          [24.0000, 25.0000, 26.0000, 27.0000, 28.0000, 29.0000, 30.0000, 31.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000, 0.0000]]]])

But when I fed these into bilinear_interpolate_torch_gridsample, I got:

tensor([[[[ 1.5000,  2.5000,  3.5000,  4.5000,  5.5000,  6.5000,  7.5000, 8.5000],
          [ 9.5000, 10.5000, 11.5000, 12.5000, 13.5000, 14.5000, 15.5000, 16.5000],
          [17.5000, 18.5000, 19.5000, 20.5000, 21.5000, 22.5000, 23.5000, 24.5000],
          [19.5000, 20.3125, 21.1250, 21.9375, 22.7500, 23.5625, 24.3750, 25.1875]]]])

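Note that I set align_corners=True explicitly, because I use PyTorch 1.7.0 and True was the default behavior only up to PyTorch 1.2.0. I think the cause is a wrong normalization: specifically, the width of the tensor is used to normalize the vertical coordinates. The results can be explained by the vertical shift, measured in rows:
\delta_{v} = 1 / (w / s) * (h - 1) / 2 = 1 / (8 / 1) * (4 - 1) / 2 = 1.5 / 8 = 0.1875
Adjacent rows of the toy tensor differ by 8, so a shift of 0.1875 rows adds 8 * 0.1875 = 1.5 to every value, which matches the first three rows of the output above.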
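As a rough check of this arithmetic, the sketch below computes how many rows a one-pixel vertical offset actually moves under different normalization factors. The helper name row_shift is hypothetical and not part of the repository; the third case, (h - 1) / 2, is an assumption about what an exact normalization for align_corners=True would look like.

```python
def row_shift(offset_px, norm, h):
    """Rows actually moved when offset_px / norm is added to a [-1, 1] grid
    that is read with align_corners=True."""
    # With align_corners=True, one normalized unit spans (h - 1) / 2 rows.
    return offset_px / norm * (h - 1) / 2.0


h, w, s = 4, 8, 1.0

shift_original = row_shift(1.0, w / s, h)        # width normalizes the vertical axis
shift_swapped = row_shift(1.0, h / s, h)         # height normalizes the vertical axis
shift_exact = row_shift(1.0, (h - 1) / 2.0, h)   # assumed exact factor, about one row

# Adjacent rows of the toy tensor differ by w = 8, so the value shift is rows * w.
print(shift_original, shift_original * w)   # 0.1875 rows, values move by 1.5
print(shift_swapped, shift_swapped * w)     # 0.375 rows, values move by 3.0
print(shift_exact)
```

Only the last factor moves the sample by a full row, which is what the expected output requires.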

So, ignoring the scale factor s and the difference between (w, h) and (w-1, h-1), a more reasonable function would be:

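import torch
import torch.nn.functional as F

def bilinear_interpolate_torch_gridsample_new(input, size, delta=0):
    out_h, out_w = size
    n, c, h, w = input.shape
    s = 1.0
    # grid_sample expects (x, y) order, so x is normalized by w and y by h
    norm = torch.tensor([[[[w/s, h/s]]]]).type_as(input).to(input.device) # not [h/s, w/s]
    # despite its name, w_list holds the vertical (y) coordinates, one value per row
    w_list = torch.linspace(-1.0, 1.0, out_h).view(-1, 1).repeat(1, out_w)
    # despite its name, h_list holds the horizontal (x) coordinates, one value per column
    h_list = torch.linspace(-1.0, 1.0, out_w).repeat(out_h, 1)
    # stack as (x, y) to form the identity sampling grid of shape (out_h, out_w, 2)
    grid = torch.cat((h_list.unsqueeze(2), w_list.unsqueeze(2)), 2)
    grid = grid.repeat(n, 1, 1, 1).type_as(input).to(input.device)
    # delta is (n, 2, out_h, out_w): channel 0 = x offset, channel 1 = y offset, in pixels
    grid = grid + delta.permute(0, 2, 3, 1) / norm

    output = F.grid_sample(input, grid, align_corners=True)
    return output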

The output of this function on the same input is:

tensor([[[[ 3.0000,  4.0000,  5.0000,  6.0000,  7.0000,  8.0000,  9.0000, 10.0000],
          [11.0000, 12.0000, 13.0000, 14.0000, 15.0000, 16.0000, 17.0000, 18.0000],
          [19.0000, 20.0000, 21.0000, 22.0000, 23.0000, 24.0000, 25.0000, 26.0000],
          [15.0000, 15.6250, 16.2500, 16.8750, 17.5000, 18.1250, 18.7500, 19.3750]]]])
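Neither output equals the expected one, because both normalizations still ignore the difference between h and (h - 1) / 2. The sketch below is a pure-Python stand-in for a single column, not the repository's code and not PyTorch's implementation. It assumes bilinear interpolation, zero padding, and align_corners=True, and the name sample_column is hypothetical. It compares the original factor, the swapped factor, and an assumed exact factor of (h - 1) / 2 on the first column of the toy tensor.

```python
import math


def sample_column(column, offset_px, norm):
    """Sample a 1-D column the way grid_sample would (bilinear, zero padding,
    align_corners=True) after adding offset_px / norm to the identity grid."""
    h = len(column)
    out = []
    for i in range(h):
        # Identity grid is linspace(-1, 1, h); the normalized offset is added to it.
        y_norm = -1.0 + 2.0 * i / (h - 1) + offset_px / norm
        # align_corners=True maps [-1, 1] onto pixel centers [0, h - 1].
        y_pix = (y_norm + 1.0) / 2.0 * (h - 1)
        y0 = math.floor(y_pix)
        frac = y_pix - y0
        # Rows outside the tensor contribute zero (zero padding).
        v0 = column[y0] if 0 <= y0 < h else 0.0
        v1 = column[y0 + 1] if 0 <= y0 + 1 < h else 0.0
        out.append((1.0 - frac) * v0 + frac * v1)
    return out


h, w = 4, 8
column = [0.0, 8.0, 16.0, 24.0]  # first column of the toy tensor

observed = sample_column(column, 1.0, norm=w)            # approx. [1.5, 9.5, 17.5, 19.5]
proposed = sample_column(column, 1.0, norm=h)            # approx. [3.0, 11.0, 19.0, 15.0]
exact = sample_column(column, 1.0, norm=(h - 1) / 2.0)   # approx. [8.0, 16.0, 24.0, 0.0]

print(observed)
print(proposed)
print(exact)
```

Under these assumptions, only the (h - 1) / 2 factor reproduces the first column of the expected output, which suggests the horizontal factor would need to be (w - 1) / 2 in the same way.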

So is this a deliberate design (the final prediction results seem good with your original code), a problem caused by the PyTorch version, or a real bug?

This is possibly caused by a difference between PyTorch versions.
