Channel attention is widely recognized as an effective technique in computer vision. However, the channel attention proposed in SENet loses information during feature learning because it uses Global Average Pooling (GAP) to reduce each channel to a single scalar. Designing effective channel attention mechanisms therefore requires a way to preserve more feature information when modeling channel inter-dependencies. In this work, we use wavelet transform compression as a solution to this channel representation problem. We first test the wavelet transform as an auto-encoder model equipped with a conventional channel attention module. Next, we test the wavelet transform as a standalone channel compression method. We prove that global average pooling is equivalent to the recursive approximate Haar wavelet transform. Building on this proof, we generalize channel attention using wavelet compression and name the resulting method WaveNet. Our method can be embedded within existing channel attention methods with a couple of lines of code. We evaluate the proposed method on the ImageNet dataset for image classification. It outperforms the baseline SENet and achieves state-of-the-art results. Our code is publicly available at https://github.com/hady1011/WaveNet-C.
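The stated equivalence between global average pooling and the recursive approximate Haar wavelet transform can be checked numerically on a toy input. The following is a minimal sketch, not the authors' implementation: it assumes a single square channel whose side is a power of two, uses the orthonormal Haar scaling (each approximation step sums a 2x2 block and divides by 2), and the function names `haar_approx_2d` and `recursive_haar_scalar` are hypothetical. Under these assumptions, the last remaining approximation coefficient equals the GAP value up to a constant factor of 2 raised to the number of levels; the normalization used in the paper's proof may differ.

```python
import numpy as np


def haar_approx_2d(x):
    """One level of the 2D Haar approximation (low-low) sub-band.

    Assumes orthonormal scaling: each non-overlapping 2x2 block is
    summed and multiplied by 1/2, i.e. (1/sqrt(2)) ** 2.
    """
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 2.0


def recursive_haar_scalar(x):
    """Repeat the approximation step until one coefficient remains.

    Assumes x is square with a power-of-two side length.
    Returns the final coefficient and the number of levels applied.
    """
    levels = 0
    while x.shape[0] > 1:
        x = haar_approx_2d(x)
        levels += 1
    return x[0, 0], levels


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8))  # one channel; size chosen for illustration

coeff, levels = recursive_haar_scalar(feature_map)
gap = feature_map.mean()  # global average pooling of the same channel

# Each level doubles the running block average, so coeff = 2**levels * GAP.
assert np.isclose(coeff, (2 ** levels) * gap)
```

Keeping the detail sub-bands that this sketch discards, or stopping the recursion before a single coefficient remains, is the kind of generalization the abstract refers to: the channel descriptor then retains more than a single averaged value.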