Inspired by the long-range modeling ability of ViTs, large-kernel convolutions have recently been widely studied and adopted to enlarge the receptive field and improve model performance, such as the remarkable work ConvNeXt, which employs 7x7 depthwise convolution. Although such a depthwise operator consumes only a few FLOPs, it largely harms model efficiency on powerful computing devices due to high memory access costs. For example, ConvNeXt-T has FLOPs similar to ResNet-50's but achieves only ~60% of its throughput when trained on A100 GPUs with full precision. Although reducing the kernel size of ConvNeXt can improve speed, it results in significant performance degradation, which poses a challenging problem: how to speed up large-kernel-based CNN models while preserving their performance. To tackle this issue, inspired by Inception, we propose to decompose large-kernel depthwise convolution into four parallel branches along the channel dimension, i.e., a small square kernel, two orthogonal band kernels, and an identity mapping. With this new Inception depthwise convolution, we build a series of networks, namely InceptionNeXt, which not only enjoy high throughput but also maintain competitive performance. For instance, InceptionNeXt-T achieves 1.6x higher training throughput than ConvNeXt-T, as well as a 0.2% top-1 accuracy improvement on ImageNet-1K. We anticipate InceptionNeXt can serve as an economical baseline for future architecture design to reduce carbon footprint. Code is available at https://github.com/sail-sg/inceptionnext.
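The decomposition described above can be sketched as follows. This is a minimal, illustrative numpy implementation, not the authors' code: the kernel sizes (3x3 square, 1x11 and 11x1 bands), the branch ratio, and the uniform averaging kernel weights are all assumptions chosen for clarity; the real model learns the kernel weights and uses optimized depthwise convolution kernels.

```python
import numpy as np

def depthwise_conv2d(x, kernels):
    """Per-channel 2D convolution with 'same' zero padding.
    x: (C, H, W); kernels: list of C arrays of shape (kh, kw)."""
    C, H, W = x.shape
    out = np.zeros_like(x, dtype=float)
    for c in range(C):
        k = kernels[c]
        kh, kw = k.shape
        ph, pw = kh // 2, kw // 2
        xp = np.pad(x[c], ((ph, ph), (pw, pw)))
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def inception_dwconv2d(x, square=3, band=11, branch_ratio=0.125):
    """Split channels into four parallel branches: a small square kernel,
    a horizontal band kernel, a vertical band kernel, and an identity
    mapping on the remaining channels (illustrative branch_ratio)."""
    C = x.shape[0]
    g = int(C * branch_ratio)  # channels assigned to each conv branch
    x_sq, x_w, x_h, x_id = np.split(x, [g, 2 * g, 3 * g], axis=0)
    # Averaging kernels stand in for learned depthwise weights.
    out_sq = depthwise_conv2d(x_sq, [np.ones((square, square)) / square**2] * g)
    out_w = depthwise_conv2d(x_w, [np.ones((1, band)) / band] * g)
    out_h = depthwise_conv2d(x_h, [np.ones((band, 1)) / band] * g)
    # Identity branch passes through untouched; concatenate along channels.
    return np.concatenate([out_sq, out_w, out_h, x_id], axis=0)

x = np.random.rand(16, 8, 8)
y = inception_dwconv2d(x)  # same shape as x; last 10 channels unchanged
```

The band kernels cover the large receptive field at a fraction of the memory traffic of a full kxk depthwise kernel, and the identity branch leaves most channels untouched, which is where the throughput gain comes from.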