Channel pruning is widely used to accelerate modern convolutional neural networks (CNNs). The resulting pruned model benefits from immediate deployment on general-purpose software and hardware. However, its coarse pruning granularity, at the unit of an entire convolution filter, often causes undesirable accuracy drops because it offers little flexibility in deciding how and where to introduce sparsity into the CNN. In this paper, we propose REPrune, a novel channel pruning technique that emulates kernel pruning, fully exploiting this finer yet still structured granularity. REPrune identifies similar kernels within each channel using agglomerative clustering. It then selects the filters that maximize the incorporation of kernel representatives by solving a maximum cluster coverage problem. Integrated with a simultaneous training-pruning paradigm, REPrune enables efficient, progressive pruning while the CNN is trained, avoiding the conventional train-prune-finetune sequence. Experimental results show that REPrune outperforms existing methods on computer vision tasks, effectively balancing acceleration ratio and accuracy retention.
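The two-step procedure described above (per-channel agglomerative clustering of kernels, then filter selection for maximum cluster coverage) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the layer dimensions, the cluster count per channel, the filter budget, and the greedy approximation of the coverage objective are all assumptions chosen for demonstration; it uses SciPy's hierarchical clustering on random weights.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Toy conv layer: 8 filters, 4 input channels, 3x3 kernels (hypothetical sizes).
weights = rng.normal(size=(8, 4, 3, 3))
n_filters, n_channels = weights.shape[:2]

# Step 1: agglomerative clustering of the kernels within each input channel.
n_clusters = 3  # assumed cluster count per channel
clusters_per_channel = []
for c in range(n_channels):
    kernels = weights[:, c].reshape(n_filters, -1)  # one flattened kernel per filter
    Z = linkage(kernels, method="ward")             # bottom-up (agglomerative) merge tree
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    clusters_per_channel.append(labels)

# Step 2: greedy filter selection as a stand-in for the maximum cluster
# coverage problem: repeatedly keep the filter whose kernels cover the most
# clusters not yet represented by the retained set.
budget = 4  # assumed number of filters to retain
keep, covered = [], set()
for _ in range(budget):
    best, best_gain = None, -1
    for f in range(n_filters):
        if f in keep:
            continue
        gain = sum(
            (c, clusters_per_channel[c][f]) not in covered
            for c in range(n_channels)
        )
        if gain > best_gain:
            best, best_gain = f, gain
    keep.append(best)
    covered.update((c, clusters_per_channel[c][best]) for c in range(n_channels))

print(sorted(keep))  # indices of the retained filters
```

The greedy loop is the standard approximation for maximum coverage; whether REPrune uses this exact relaxation is not stated in the abstract, so it stands in only to make the coverage objective concrete.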