Neural networks based on convolutional operations have achieved remarkable results in the field of deep learning, but there are two inherent flaws in standard convolutional operations. On the one hand, the convolution operation be confined to a local window and cannot capture information from other locations, and its sampled shapes is fixed. On the other hand, the size of the convolutional kernel is fixed to k $\times$ k, which is a fixed square shape, and the number of parameters tends to grow squarely with size. It is obvious that the shape and size of targets are various in different datasets and at different locations. Convolutional kernels with fixed sample shapes and squares do not adapt well to changing targets. In response to the above questions, the Alterable Kernel Convolution (AKConv) is explored in this work, which gives the convolution kernel an arbitrary number of parameters and arbitrary sampled shapes to provide richer options for the trade-off between network overhead and performance. In AKConv, we define initial positions for convolutional kernels of arbitrary size by means of a new coordinate generation algorithm. To adapt to changes for targets, we introduce offsets to adjust the shape of the samples at each position. Moreover, we explore the effect of the neural network by using the AKConv with the same size and different initial sampled shapes. AKConv completes the process of efficient feature extraction by irregular convolutional operations and brings more exploration options for convolutional sampling shapes. Object detection experiments on representative datasets COCO2017, VOC 7+12 and VisDrone-DET2021 fully demonstrate the advantages of AKConv. AKConv can be used as a plug-and-play convolutional operation to replace convolutional operations to improve network performance. The code for the relevant tasks can be found at https://github.com/CV-ZhangXin/AKConv.
翻译:基于卷积操作的神经网络在深度学习领域取得了显著成果,但标准卷积操作存在两个固有缺陷。一方面,卷积操作局限于局部窗口,无法捕获其他位置的信息,且其采样形状固定。另一方面,卷积核的尺寸固定为k×k,即固定正方形形状,而参数数量通常随尺寸呈平方增长。显然,不同数据集以及不同位置的目标形状和大小各不相同。固定采样形状和正方形的卷积核难以适应变化的检测目标。针对上述问题,本文探索了可变核卷积(AKConv),该卷积核具有任意数量的参数和任意采样形状,为网络开销与性能之间的权衡提供了更丰富的选择。在AKConv中,我们通过一种新的坐标生成算法定义了任意大小卷积核的初始位置。为了适应目标的变化,我们引入偏移量来调整每个位置样本的形状。此外,我们探索了使用相同尺寸但不同初始采样形状的AKConv对神经网络的影响。AKConv通过非规则卷积操作完成高效特征提取,并为卷积采样形状带来更多探索选项。在代表性数据集COCO2017、VOC 7+12和VisDrone-DET2021上的目标检测实验充分证明了AKConv的优势。AKConv可作为即插即用的卷积操作替代标准卷积,以提升网络性能。相关任务代码可在https://github.com/CV-ZhangXin/AKConv获取。