cfzd / fcanet

FcaNet: Frequency Channel Attention Networks

License: MIT License

Python 99.65% Shell 0.27% Dockerfile 0.03% Makefile 0.03% Batchfile 0.03%

fcanet's Issues

Question about the learnable tensor at L73-L83 in model/layer.py

Hello!

When using

# learnable DCT init
self.register_parameter('weight', self.get_dct_filter(height, width, mapper_x, mapper_y, channel))
# learnable random init
self.register_parameter('weight', torch.rand(channel, height, width))

either of these two initialization methods, the following bug occurs:

TypeError: cannot assign 'torch.FloatTensor' object to parameter 'weight' (torch.nn.Parameter or None required)

Please help :)
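
The error message itself points at the likely cause: `register_parameter` accepts only a `torch.nn.Parameter` (or `None`), while both calls above pass a plain tensor. A minimal sketch of the fix, assuming a stand-alone module (the class name `LearnableWeightLayer` is hypothetical, and only the random-init variant is shown):

```python
import torch
import torch.nn as nn


class LearnableWeightLayer(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, channel, height, width):
        super().__init__()
        init = torch.rand(channel, height, width)
        # Wrap the plain tensor in nn.Parameter so register_parameter accepts it
        # and the optimizer can update it.
        self.register_parameter('weight', nn.Parameter(init))


layer = LearnableWeightLayer(channel=4, height=7, width=7)
print(type(layer.weight).__name__, layer.weight.requires_grad)
```

The DCT-initialized variant can presumably be wrapped the same way, by passing the tensor returned by `self.get_dct_filter(...)` to `nn.Parameter`.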

channel groups?

Hi, if the channels are not split into groups, would the following code be enough?

def get_dct_filter(self, tile_size_x, tile_size_y, mapper_x, mapper_y, channel):
    dct_filter = torch.zeros(channel, tile_size_x, tile_size_y)
    for i in range(channel):
        # Note: every (u_x, v_y) pair writes to the same slot, so only the last pair is kept.
        for _, (u_x, v_y) in enumerate(zip(mapper_x, mapper_y)):
            for t_x in range(tile_size_x):
                for t_y in range(tile_size_y):
                    dct_filter[i, t_x, t_y] = self.build_filter(t_x, u_x, tile_size_x) * self.build_filter(
                        t_y, v_y, tile_size_y)
    return dct_filter

Has anyone been able to run this code on their own machine? Does it work?

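Question about the optimal frequency components

Are the optimal frequency components in the 'top' lists only valid for a 7x7 feature map? If the size changes to 48x48, do the optimal frequency components need to be replaced, or do they still work? Any pointers would be appreciated.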
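
In the MultiSpectralAttentionLayer code quoted elsewhere on this page, the stored indices are defined on a 7x7 grid and are multiplied by `dct_h // 7` and `dct_w // 7` before the filters are built. A minimal sketch of that scaling for a 48x48 map, assuming nothing else changes with the feature-map size (the helper name `scale_freq_indices` is hypothetical):

```python
def scale_freq_indices(indices, dct_size, base_size=7):
    """Scale indices defined on a base_size grid to a dct_size grid."""
    scale = dct_size // base_size  # integer scale; this is 0 if dct_size < base_size
    return [i * scale for i in indices]


# First four 'top' indices, copied from the lists in get_freq_indices.
top4_x = [0, 0, 6, 0]
top4_y = [0, 1, 0, 5]

print(scale_freq_indices(top4_x, 48))  # [0, 0, 36, 0]
print(scale_freq_indices(top4_y, 48))  # [0, 6, 0, 30]
```

This only shows how the indices are mapped; whether the same ranking of components remains optimal at other resolutions is a separate empirical question.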

2d dct FLOPs computing method

Hi, I noticed that your paper reports the FLOPs of the FcaNet models.

How do you compute the FLOPs of the 2D DCT? Could you provide your formula or code?
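
One way to reason about the cost, offered as a rough estimate and not as the authors' formula: in the layer code posted on this page, the DCT step is an element-wise multiplication by a precomputed filter followed by a sum over the spatial dimensions. Under that assumption the count per sample could be sketched as follows (the function name `dct_pooling_ops` is hypothetical):

```python
def dct_pooling_ops(channel, height, width):
    """Estimate multiplies and adds for the fixed-filter DCT pooling step."""
    mults = channel * height * width        # one multiply per feature-map element
    adds = channel * (height * width - 1)   # summing H*W products in each channel
    return mults, adds


print(dct_pooling_ops(64, 56, 56))  # (200704, 200640)
```

How these operations are folded into a reported FLOPs figure depends on the counting convention of the profiling tool, which the thread does not state.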

Thanks!

get_freq_indices

Hello, what is the purpose of the index arrays in the get_freq_indices function? I could not work out the pattern behind them.
Thanks!

Isn't this much the same as SENet? Why can't it be made into a plug-and-play attention module?

I don't know how to use this module. Any suggestions?

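Hello, I am a deep learning beginner. I added two FCA modules to my model and its mIoU improved by 2.3, which is a great result.
However, I have some other thoughts about channel grouping.
If the grouped channels represent different information and each group uses a different frequency component, this seems likely to cause more information loss. DCT can be viewed as a weighted sum, and the paper shows that, except for GAP, which treats every pixel in a channel equally, the other components attend more to one or a few spatial regions. That is clearly biased, and it also seems to explain why GAP performs best in the single-frequency-component experiments. In that case, might grouping the channels cause even more information loss?
After thinking it over, I believe FCA works mainly because of channel redundancy and a kind of "complementarity" formed by the DCT weighting.
Because channels are redundant, some groups may carry similar information when the channels are split, and the weights of those groups are "complementary": for example, one weight matrix emphasizes the left half and another the right half. The module seems to learn this kind of "sparse" relationship better.
In other words, FCA makes fuller use of the redundant channels than SE does.
I have considered two experiments to test this:
Keep reducing the number of input channels and compare FCA with the original model or SE. Once the channels are reduced far enough, the information is no longer so redundant, a lot of information should be lost, and the accuracy should fall below that of the original model.
For frequency selection, pick weight matrices that are "symmetric" or "complementary" rather than selecting by single-component performance, and drop the "chaotic" weight matrices, since the single-component experiments show that such chaotic weights do not work as well as simple block-like ones.
In addition, larger groups could be used for large channel counts and smaller groups for small channel counts, to check whether this gives better performance.

I may not have expressed myself fully; if anything is wrong, please point it out!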

GAP and [0,0]

How can I verify in the code that the [0,0] component is GAP?
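
One way to check this numerically, as a sketch and not the authors' own test: with `build_filter` as it appears in the code posted on this page, the (0, 0) filter is the constant 1/sqrt(H*W), so the (0, 0) component should equal GAP multiplied by sqrt(H*W). The helper `dct_component` and the sample values are illustrative.

```python
import math


def build_filter(pos, freq, POS):
    # Same formula as build_filter in the code posted on this page.
    result = math.cos(math.pi * freq * (pos + 0.5) / POS) / math.sqrt(POS)
    return result if freq == 0 else result * math.sqrt(2)


def dct_component(x, u, v):
    """2D DCT coefficient (u, v) of a single-channel map given as a list of rows."""
    h, w = len(x), len(x[0])
    return sum(x[i][j] * build_filter(i, u, h) * build_filter(j, v, w)
               for i in range(h) for j in range(w))


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
h, w = len(x), len(x[0])
gap = sum(sum(row) for row in x) / (h * w)
dc = dct_component(x, 0, 0)

# The (0, 0) component is GAP up to the constant factor sqrt(H * W).
assert math.isclose(dc, gap * math.sqrt(h * w))
```

Because the factor depends only on the feature-map size, it is the same for every channel.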

dct_h and dct_w

How should I set dct_h and dct_w if I want to add an FCA layer to another model? The feature maps at the layers where I want to insert the FCA layer are 160x160, 80x80, 40x40, and 20x20.

Please advise.

What's the difference between FcaBottleneck and FcaBasicBlock?

In your code, FcaBottleneck has an expansion of 4 and FcaBasicBlock has an expansion of 1, and FcaBottleneck has one more convolution layer than FcaBasicBlock. How should I choose which module to use?

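Can 1D GAP be regarded as a special case of the 1D DCT?

Hello, I have read your paper, which provides the proof for the two-dimensional case.
Can one-dimensional global average pooling (GAP) likewise be regarded as a special case of the one-dimensional discrete cosine transform (DCT)?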
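
A short derivation may help here. It assumes the 1D basis implied by the `build_filter` function posted on this page (an orthonormal DCT-II), with a length-N input x; the symbols B, c, and f are notation introduced for this sketch.

```latex
\begin{align}
B_i^{k} &= c_k \cos\!\left(\frac{\pi k \left(i + \tfrac{1}{2}\right)}{N}\right),
\qquad c_0 = \frac{1}{\sqrt{N}}, \quad c_k = \sqrt{\frac{2}{N}} \ \text{for } k > 0, \\
f_0 &= \sum_{i=0}^{N-1} x_i B_i^{0}
     = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} x_i
     = \sqrt{N}\,\mathrm{GAP}(x).
\end{align}
```

Under these assumptions the lowest-frequency 1D coefficient is GAP up to the constant factor sqrt(N), which mirrors the two-dimensional case.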

Visualization of paper Figure 5 & Figure 6

Hi, I've read your paper. It's good work.
Could you provide the visualization code for Figures 5 and 6 of the FcaNet paper?

Thanks a lot!

get_dct_weights() cannot be found

Isn't it supposed to be a one-line change compared with SENet? Also, I cannot find the get_dct_weights() function.

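Does c2wh = dict([(64,56), (128,28), (256,14), (512,7)]) need to be changed?

My input size is (64, 48, 48). If I keep the original c2wh = dict([(64,56), (128,28), (256,14), (512,7)]), adaptive_avg_pool2d() will resize 48 to 56 (a smaller size to a larger one). Will that hurt performance, and what is the best way to handle it? Any advice would be appreciated.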

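About the purpose of splitting into n parts

Hello, I would like to ask about the split. The paper says the input channels are divided into n parts and a DCT filter of a different frequency is applied to each part. What is the purpose of dividing into n parts? Why not apply every DCT filter to all channels? Is it to limit the computational cost?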

How should self.dct_h and self.dct_w be set?

The MultiSpectralAttentionLayer class contains the following:
if h != self.dct_h or w != self.dct_w:
    x_pooled = torch.nn.functional.adaptive_avg_pool2d(x, (self.dct_h, self.dct_w))
    # If you have concerns about one-line-change, don't worry. :)
    # In the ImageNet models, this line will never be triggered.
    # This is for compatibility in instance segmentation and object detection.

If my task is object detection, how should I set self.dct_h and self.dct_w?

Adding FCA on top of SE-Net

self.register_buffer('weight', self.get_dct_weights())
self.fc = nn.Sequential(
    nn.Linear(c2, c2 // reduction, bias=False),
    nn.ReLU(inplace=True),
    nn.Linear(c2 // reduction, c2, bias=False),
    nn.Sigmoid()
)
How should the arguments of get_dct_weights() be set?

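About model accuracy

Hello, thank you very much for your excellent work. Using your code and model, I get a validation accuracy of 78.39 for FcaNet50 on ImageNet. Is that normal? My environment is Ubuntu 20.04, CUDA 11.6, PyTorch 1.10, and four 3090 GPUs. I look forward to your reply.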

Channel splitting

Hello, I have read your code but could not find where the channel splitting is done. Could you point it out?
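
Judging from the `get_dct_filter` code posted on this page, there may be no explicit split call at all: the grouping appears to be implicit in the slice `i * c_part:(i + 1) * c_part`, which gives each consecutive block of channels the filter of one frequency pair. A minimal sketch of that mapping (the helper name `channel_to_freq` is hypothetical):

```python
def channel_to_freq(channel, mapper_x, mapper_y):
    """Return the (u_x, v_y) frequency pair assigned to each channel index."""
    assert len(mapper_x) == len(mapper_y)
    assert channel % len(mapper_x) == 0
    c_part = channel // len(mapper_x)  # channels per frequency group
    return [(mapper_x[c // c_part], mapper_y[c // c_part]) for c in range(channel)]


# 8 channels and the first four 'top' frequency pairs: two channels per pair.
print(channel_to_freq(8, [0, 0, 6, 0], [0, 1, 0, 5]))
# [(0, 0), (0, 0), (0, 1), (0, 1), (6, 0), (6, 0), (0, 5), (0, 5)]
```

The element-wise multiplication with this per-channel filter then applies a different frequency to each group without slicing the input tensor.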

launch.py: error: argument --nproc_per_node: invalid int value: ''

I am running training and got this error. Do you know how to deal with it?

Regarding the setting of c2wh = dict([(64,56), (128,28), (256,14) ,(512,7)]) in the fcanet.py file.

In my model, the output feature map shape is (512, 16, 16), and I am worried that the adaptive_avg_pool2d() operation in layer.py will cause information loss. Does the parameter c2wh = dict([(64,56), (128,28), (256,14), (512,7)]) need to be changed?

Vector dimension

Do top1 and top2 produce vectors of the same dimension?

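Could you provide the trained ResNet50 weights from the comparison experiments?

Dear author:
Would it be possible to share the trained weights of the ResNet50 in Table 1 of the paper, the one with 77.27 top-1 accuracy? Many thanks.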

get_freq_indices?

Hello, could you please tell me how the indices for the three methods in the code are calculated? Thank you!

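Low-frequency components

Can this directly extract the low-frequency components of an image, and would that be better than the learnable DCT?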

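How should we understand and explain that a fixed DCT is better than a learnable one?

As in the title: how should we understand and explain that a fixed DCT is better than a learnable one? Does it mean the network cannot learn, or finds it very hard to learn, a better result to serve as the frequency-component template?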

Implementing MultiSpectralAttentionLayer in TensorFlow

Hi author, thank you for your great work. I am implementing MultiSpectralAttentionLayer (MSA) in TensorFlow, but the layer makes training quite slow, so I think I made a mistake in re-implementing it. I could not find a TensorFlow alternative to register_buffer for creating the fixed DCT weights, which may be the problem. Could you review it?

import math

import numpy as np
import tensorflow as tf

def get_freq_indices(method):
    assert method in ['top1', 'top2', 'top4', 'top8', 'top16', 'top32',
                      'bot1', 'bot2', 'bot4', 'bot8', 'bot16', 'bot32',
                      'low1', 'low2', 'low4', 'low8', 'low16', 'low32']
    num_freq = int(method[3:])
    if 'top' in method:
        all_top_indices_x = [0, 0, 6, 0, 0, 1, 1, 4, 5, 1, 3, 0, 0, 0, 3, 2, 4, 6, 3, 5, 5, 2, 6, 5, 5, 3, 3, 4, 2, 2, 6, 1]
        all_top_indices_y = [0, 1, 0, 5, 2, 0, 2, 0, 0, 6, 0, 4, 6, 3, 5, 2, 6, 3, 3, 3, 5, 1, 1, 2, 4, 2, 1, 1, 3, 0, 5, 3]
        mapper_x = all_top_indices_x[:num_freq]
        mapper_y = all_top_indices_y[:num_freq]
    elif 'low' in method:
        all_low_indices_x = [0, 0, 1, 1, 0, 2, 2, 1, 2, 0, 3, 4, 0, 1, 3, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4]
        all_low_indices_y = [0, 1, 0, 1, 2, 0, 1, 2, 2, 3, 0, 0, 4, 3, 1, 5, 4, 3, 2, 1, 0, 6, 5, 4, 3, 2, 1, 0, 6, 5, 4, 3]
        mapper_x = all_low_indices_x[:num_freq]
        mapper_y = all_low_indices_y[:num_freq]
    elif 'bot' in method:
        all_bot_indices_x = [6, 1, 3, 3, 2, 4, 1, 2, 4, 4, 5, 1, 4, 6, 2, 5, 6, 1, 6, 2, 2, 4, 3, 3, 5, 5, 6, 2, 5, 5, 3, 6]
        all_bot_indices_y = [6, 4, 4, 6, 6, 3, 1, 4, 4, 5, 6, 5, 2, 2, 5, 1, 4, 3, 5, 0, 3, 1, 1, 2, 4, 2, 1, 1, 5, 3, 3, 3]
        mapper_x = all_bot_indices_x[:num_freq]
        mapper_y = all_bot_indices_y[:num_freq]
    else:
        raise NotImplementedError
    return mapper_x, mapper_y

class MultiSpectralAttentionLayer(tf.keras.layers.Layer):
    def __init__(self, channel, dct_h, dct_w, reduction=16, freq_sel_method='top16'):
        super(MultiSpectralAttentionLayer, self).__init__()
        self.reduction = reduction
        self.dct_h = dct_h
        self.dct_w = dct_w

        mapper_x, mapper_y = get_freq_indices(freq_sel_method)
        self.num_split = len(mapper_x)
        mapper_x = [temp_x * (dct_h // 7) for temp_x in mapper_x]
        mapper_y = [temp_y * (dct_w // 7) for temp_y in mapper_y]

        self.dct_layer = MultiSpectralDCTLayer(dct_h, dct_w, mapper_x, mapper_y, channel)
        self.fc = tf.keras.Sequential([
            tf.keras.layers.Dense(channel // reduction, use_bias=False),
            tf.keras.layers.ReLU(),
            tf.keras.layers.Dense(channel, use_bias=False),
            tf.keras.layers.Activation('sigmoid')
        ])

    def call(self, x):
        n, h, w, c = x.shape
        x_pooled = x
        if h != self.dct_h or w != self.dct_w:
            x_pooled = tf.image.resize(x, (self.dct_h, self.dct_w))
        y = self.dct_layer(x_pooled)
        y = self.fc(y)
        y = tf.expand_dims(tf.expand_dims(y, axis=1), axis=1)
        return x * y

class MultiSpectralDCTLayer(tf.keras.layers.Layer):
    def __init__(self, height, width, mapper_x, mapper_y, channel):
        super(MultiSpectralDCTLayer, self).__init__()

        assert len(mapper_x) == len(mapper_y)
        assert channel % len(mapper_x) == 0

        self.num_freq = len(mapper_x)
        self.height = height
        self.width = width
        self.mapper_x = mapper_x
        self.mapper_y = mapper_y
        self.channel = channel
        self.weight = tf.Variable(initial_value=self.get_dct_filter(), trainable=False, name='weight') # The PyTorch model uses self.register_buffer for the fixed DCT weights; I could not find an equivalent in TensorFlow

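    def call(self, x):
        # x is NHWC here, so the spatial axes are 1 and 2 (summing over 2 and 3 would reduce width and channels)
        x = x * self.weight
        result = tf.reduce_sum(x, axis=[1, 2])
        return result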

    def build_filter(self, pos, freq, POS):
        result = math.cos(math.pi * freq * (pos + 0.5) / POS) / math.sqrt(POS)
        if freq == 0:
            return result
        else:
            return result * math.sqrt(2)

    def get_dct_filter(self):
        dct_filter = np.zeros((self.height, self.width, self.channel))
        c_part = self.channel // self.num_freq

        for i, (u_x, v_y) in enumerate(zip(self.mapper_x, self.mapper_y)):
            for t_x in range(self.height):
                for t_y in range(self.width):
                    dct_filter[t_x, t_y, i * c_part: (i + 1) * c_part] = \
                        self.build_filter(t_x, u_x, self.height) * self.build_filter(t_y, v_y, self.width)

        return tf.constant(dct_filter, dtype=tf.float32)

Determining the frequency components

How were these frequency components determined?

model.init

I cannot find your model init code. Can anybody tell me where it is? Thanks.

Adapting to 3D

Hello, how should FCA be modified to apply it in a 3D network?

Slightly inconsistent

In layer.py,
class MultiSpectralAttentionLayer(torch.nn.Module): contains
self.dct_layer = MultiSpectralDCTLayer(dct_h, dct_w, mapper_x, mapper_y, channel)
so dct_h comes first and dct_w second, that is, h before w.
But in class MultiSpectralDCTLayer(nn.Module): we have
def __init__(self, width, height, mapper_x, mapper_y, channel):
so width comes first and height second, that is, w before h.
Is there a reason for this? I am confused.

Official code for FCANet?

Hi, @cfzd
I have paid attention to FCANet from last year, so is this official code?

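About the selection of specific frequency components

How were the lists in the get_freq_indices function predefined? I ask because I want to transfer your work to one-dimensional data.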
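
For the one-dimensional case, a possible starting point is to reuse the 1D basis from `build_filter` (as it appears in the code posted on this page) and assign one frequency to each channel group. This is only a sketch: the helper `get_dct_filter_1d` is hypothetical, and the frequency list passed to it is an arbitrary example, since the thread does not say how frequencies should be chosen for 1D data.

```python
import math


def build_filter(pos, freq, POS):
    # Same formula as build_filter in the code posted on this page.
    result = math.cos(math.pi * freq * (pos + 0.5) / POS) / math.sqrt(POS)
    return result if freq == 0 else result * math.sqrt(2)


def get_dct_filter_1d(length, freqs, channel):
    """Return one length-`length` filter per channel, grouped by frequency."""
    assert channel % len(freqs) == 0
    c_part = channel // len(freqs)  # channels per frequency group
    filters = []
    for u in freqs:
        row = [build_filter(t, u, length) for t in range(length)]
        filters.extend(list(row) for _ in range(c_part))
    return filters


# Example: 4 channels, sequence length 8, frequencies 0 and 1 (arbitrary choice).
filters = get_dct_filter_1d(length=8, freqs=[0, 1], channel=4)
print(len(filters), len(filters[0]))
```

Each channel's descriptor would then be the sum over positions of the input multiplied by its filter, analogous to the 2D layer.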

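Some problems I ran into when running your model

Hello, I really like your idea, so I tried to run your classification model.
After downloading the ImageNet2012 dataset I tried to launch your model and ran into the following problem. Could you tell me whether some of my settings are wrong?

The error message is: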
Traceback (most recent call last):
  File "main.py", line 643, in <module>
    main()
  File "main.py", line 389, in main
    avg_train_time = train(train_loader, model, criterion, optimizer, epoch, logger, scheduler)
  File "main.py", line 471, in train
    prec1, prec5 = accuracy(output.data, target, topk=(1, 5))
  File "main.py", line 631, in accuracy
    correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
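
The last line of the traceback suggests the usual remedy: replace `.view(-1)` with `.reshape(-1)`, which also works on non-contiguous tensors. A minimal sketch of a top-k accuracy helper with that change; only the `correct_k` line comes from the traceback, and the rest of the function body is an assumption modeled on common ImageNet training scripts, not the repository's actual code.

```python
import torch


def accuracy(output, target, topk=(1,)):
    """Top-k accuracy in percent (assumed structure; see the note above)."""
    maxk = max(topk)
    batch_size = target.size(0)
    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()  # transposing makes the tensor non-contiguous
    correct = pred.eq(target.view(1, -1).expand_as(pred))
    res = []
    for k in topk:
        # reshape copies when needed, so it does not require contiguous memory.
        correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res


logits = torch.tensor([[0.10, 0.90, 0.00],
                       [0.80, 0.05, 0.15]])
labels = torch.tensor([1, 2])
top1, top2 = accuracy(logits, labels, topk=(1, 2))
print(top1.item(), top2.item())  # 50.0 100.0
```

Calling `.contiguous()` before `.view(-1)` would be an equivalent alternative.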

Selecting frequency components

Hi, I want to know how you selected the frequency components, as in Figure 6.
I want to select 1, 3, 6, or 10 frequencies, as in zigzag DCT.

Also, I want to know the meaning of the numbers in layer.py.

num_freq = int(method[3:])
if 'top' in method:
    all_top_indices_x = [0,0,6,0,0,1,1,4,5,1,3,0,0,0,3,2,4,6,3,5,5,2,6,5,5,3,3,4,2,2,6,1]
    all_top_indices_y = [0,1,0,5,2,0,2,0,0,6,0,4,6,3,5,2,6,3,3,3,5,1,1,2,4,2,1,1,3,0,5,3]
    mapper_x = all_top_indices_x[:num_freq]
    mapper_y = all_top_indices_y[:num_freq]
elif 'low' in method:
    all_low_indices_x = [0,0,1,1,0,2,2,1,2,0,3,4,0,1,3,0,1,2,3,4,5,0,1,2,3,4,5,6,1,2,3,4]
    all_low_indices_y = [0,1,0,1,2,0,1,2,2,3,0,0,4,3,1,5,4,3,2,1,0,6,5,4,3,2,1,0,6,5,4,3]
    mapper_x = all_low_indices_x[:num_freq]
    mapper_y = all_low_indices_y[:num_freq]
elif 'bot' in method:
    all_bot_indices_x = [6,1,3,3,2,4,1,2,4,4,5,1,4,6,2,5,6,1,6,2,2,4,3,3,5,5,6,2,5,5,3,6]
    all_bot_indices_y = [6,4,4,6,6,3,1,4,4,5,6,5,2,2,5,1,4,3,5,0,3,1,1,2,4,2,1,1,5,3,3,3]
    mapper_x = all_bot_indices_x[:num_freq]
    mapper_y = all_bot_indices_y[:num_freq]
else:
    raise NotImplementedError
return mapper_x, mapper_y
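
The counts 1, 3, 6, and 10 are triangular numbers, so they correspond to taking the first one to four whole anti-diagonals of the frequency grid. A minimal sketch that generates such index lists in the same `mapper_x`/`mapper_y` form as the code above (the helper name `low_freq_indices` is hypothetical, and this is not how the lists in layer.py were produced):

```python
def low_freq_indices(num_diagonals, size=7):
    """Collect (x, y) index lists for the first anti-diagonals of a size x size grid."""
    xs, ys = [], []
    for d in range(num_diagonals):   # d = x + y identifies one anti-diagonal
        for x in range(d + 1):
            y = d - x
            if x < size and y < size:
                xs.append(x)
                ys.append(y)
    return xs, ys


for n in range(1, 5):
    mapper_x, mapper_y = low_freq_indices(n)
    print(len(mapper_x), mapper_x, mapper_y)
```

Within each diagonal the order does not alternate direction as a true zigzag scan does, but the selected set of frequencies is the same. Note that the number of selected frequencies must divide the channel count, as the `assert channel % len(mapper_x) == 0` check in the code posted on this page requires.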
