Dynamic convolution improves the performance of efficient CNNs at the cost of a negligible increase in FLOPs. However, the performance gain does not match the significantly expanded number of parameters, which is the main bottleneck in real-world applications. In contrast, mask-based unstructured pruning obtains a lightweight network by removing redundancy from a heavy network. In this paper, we propose a new framework, \textbf{Sparse Dynamic Convolution} (\textsc{SD-Conv}), which naturally integrates these two paths so that it inherits the advantages of both the dynamic mechanism and sparsity. We first design a binary mask derived from a learnable threshold to prune the static kernels, significantly reducing the parameters and computational cost while achieving higher performance on ImageNet-1K. We further transfer the pretrained models to a variety of downstream tasks, where they show consistently better results than the baselines. We hope \textsc{SD-Conv} can serve as an efficient alternative to conventional dynamic convolutions.
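As a rough illustration of how a threshold-derived binary mask could be combined with dynamic kernel aggregation, consider the following sketch. The notation is assumed for illustration and is not taken from the paper: $K$ static kernels $W_k$, input-dependent attention weights $\pi_k(x)$, and one learnable threshold $\tau_k$ per kernel are all hypothetical choices.

\[
\begin{aligned}
M_k &= \mathbb{1}\!\left[\,|W_k| > \tau_k\,\right], && k = 1,\dots,K,\\
\widetilde{W}(x) &= \sum_{k=1}^{K} \pi_k(x)\,\bigl(M_k \odot W_k\bigr), && \pi_k(x) \ge 0,\quad \sum_{k=1}^{K} \pi_k(x) = 1,\\
y &= \widetilde{W}(x) * x .
\end{aligned}
\]

Here $\mathbb{1}[\cdot]$ is applied element-wise, $\odot$ denotes the element-wise product, and $*$ denotes convolution. Under these assumptions, only the entries of $W_k$ whose magnitude exceeds $\tau_k$ need to be stored, which is where the parameter reduction would come from, while the aggregation over $k$ keeps the kernel input-dependent.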