Structured pruning compresses neural networks by removing channels (filters), yielding faster inference and a smaller run-time footprint. Fine-tuning is usually applied to the pruned network to restore accuracy. However, the few parameters that remain after pruning make it difficult for fine-tuning to recover the lost accuracy. To address this challenge, we propose a novel method that first linearly over-parameterizes the compact layers of a pruned network to increase the number of parameters available for fine-tuning, and then re-parameterizes them back into the original layers after fine-tuning. Specifically, we expand each convolution/linear layer into several consecutive convolution/linear layers that are equivalent to it, so the current output feature maps are unchanged. Furthermore, we use similarity-preserving knowledge distillation, which encourages the over-parameterized block to learn the immediate data-to-data similarities of the corresponding dense layer and thereby maintain its feature learning ability. We comprehensively evaluate the proposed method on CIFAR-10 and ImageNet, where it significantly outperforms the vanilla fine-tuning strategy, especially at large pruning ratios.
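The expand-then-merge idea can be illustrated for a single linear layer. The following is a minimal sketch under stated assumptions, not the paper's implementation: the helper names (`expand_linear`, `merge_linear`) are hypothetical, the identity initialization of the added layer is one possible choice that leaves the output unchanged, and a random perturbation stands in for fine-tuning. Because no nonlinearity sits between the two layers, they collapse exactly into one layer with weight `w2 w1` and bias `w2 b1 + b2`.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def linear(w, b, x):
    """Apply a linear layer: y = w x + b."""
    return [s + bi for s, bi in zip(matvec(w, x), b)]


def expand_linear(w, b):
    """Expand y = w x + b into two consecutive linear layers with the same output.

    Assumption of this sketch: the added second layer starts as the identity
    with zero bias, so the composed function is unchanged at initialization.
    """
    out_dim = len(w)
    w1 = [row[:] for row in w]
    b1 = b[:]
    w2 = [[1.0 if i == j else 0.0 for j in range(out_dim)] for i in range(out_dim)]
    b2 = [0.0] * out_dim
    return (w1, b1), (w2, b2)


def merge_linear(layer1, layer2):
    """Collapse w2 (w1 x + b1) + b2 into one layer: w = w2 w1, b = w2 b1 + b2."""
    (w1, b1), (w2, b2) = layer1, layer2
    w = matmul(w2, w1)
    b = [s + bi for s, bi in zip(matvec(w2, b1), b2)]
    return w, b


random.seed(0)
w = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # compact 3 -> 2 layer
b = [random.uniform(-1, 1) for _ in range(2)]
x = [random.uniform(-1, 1) for _ in range(3)]

layer1, layer2 = expand_linear(w, b)

# Stand-in for fine-tuning: perturb every parameter of both layers.
for weights, bias in (layer1, layer2):
    for row in weights:
        for j in range(len(row)):
            row[j] += random.uniform(-0.1, 0.1)
    for j in range(len(bias)):
        bias[j] += random.uniform(-0.1, 0.1)

# Re-parameterize: the merged layer has the original compact shape.
w_merged, b_merged = merge_linear(layer1, layer2)
y_expanded = linear(layer2[0], layer2[1], linear(layer1[0], layer1[1], x))
y_merged = linear(w_merged, b_merged, x)
assert all(abs(p - q) < 1e-9 for p, q in zip(y_expanded, y_merged))
```

The merged layer has the same shape as the original compact layer, so the extra parameters exist only during fine-tuning and add no inference cost. Convolution layers would need the analogous kernel composition, which this sketch does not cover.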