Traditional channel-wise pruning methods, which reduce the number of network channels, struggle to prune efficient CNN models that contain depth-wise convolutional layers and certain efficient modules, such as the popular inverted residual block. Prior depth pruning methods, which reduce network depth, are unsuitable for some efficient models because of the normalization layers those models contain. Moreover, finetuning a subnet obtained by directly removing activation layers corrupts the original model weights and prevents the pruned model from reaching high performance. To address these issues, we propose a novel depth pruning method for efficient models. Our approach combines a novel block pruning strategy with a progressive training method for the subnet. We also extend the pruning method to vision transformer models. Experimental results show that our method consistently outperforms existing depth pruning methods across various pruning configurations. Applying our method to ConvNeXtV1 yields three pruned models that surpass most SOTA efficient models with comparable inference performance. Our method also achieves state-of-the-art pruning performance on the vision transformer model.