The multi-scale receptive field and the large kernel attention (LKA) module have been shown to significantly improve performance in lightweight image super-resolution. However, existing lightweight super-resolution (SR) methods seldom focus on designing efficient building blocks with multi-scale receptive fields for local modeling, and their LKA modules suffer a quadratic increase in computational and memory footprints as the convolutional kernel size increases. To address the first issue, we propose multi-scale blueprint separable convolutions (MBSConv), a highly efficient building block with a multi-scale receptive field that focuses on learning multi-scale information, a vital component of discriminative representation. As for the second issue, we revisit the key properties of LKA and find that the adjacent direct interaction of local information and long-distance dependencies is crucial for its remarkable performance. Taking this into account, and to mitigate the complexity of LKA, we propose a large coordinate kernel attention (LCKA) module, which decomposes the 2D convolutional kernels of the depth-wise convolutional layers in LKA into horizontal and vertical 1D kernels. LCKA enables the adjacent direct interaction of local information and long-distance dependencies not only in the horizontal direction but also in the vertical. Moreover, LCKA allows the depth-wise convolutional layers to use extremely large kernels directly, capturing more contextual information and significantly improving reconstruction performance while incurring lower computational complexity and memory footprints. Integrating MBSConv and LCKA, we propose a large coordinate kernel attention network (LCAN).
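The complexity argument above can be sketched quantitatively. The following is a minimal illustration (not the paper's code; the channel count and kernel size are assumed values) of why decomposing each k×k depth-wise kernel into a 1×k horizontal and a k×1 vertical kernel turns the quadratic growth in parameters into linear growth:

```python
def dw_conv2d_params(channels: int, k: int) -> int:
    # A 2D depth-wise convolution (as in LKA) stores one k x k kernel
    # per channel, so parameters grow quadratically with k.
    return channels * k * k

def dw_conv1d_pair_params(channels: int, k: int) -> int:
    # The decomposed form (as in LCKA) stores one 1 x k horizontal and
    # one k x 1 vertical kernel per channel: linear growth with k.
    return channels * (k + k)

# Assumed example configuration: 64 channels, an extremely large 21x21 kernel.
channels, k = 64, 21
p2d = dw_conv2d_params(channels, k)       # 64 * 441 = 28224
p1d = dw_conv1d_pair_params(channels, k)  # 64 * 42  = 2688
print(p2d, p1d)
```

At k = 21 the decomposition needs roughly k/2 ≈ 10x fewer depth-wise parameters, and the gap widens as the kernel grows, which is what makes extremely large kernels affordable in LCKA.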