Neural network sparsity has attracted significant research interest due to its similarity to biological schemes and its high energy efficiency. However, existing methods depend on lengthy training or fine-tuning, which hinders large-scale application. Recently, several works focusing on post-training sparsity (PTS) have emerged. They eliminate the high training cost but usually suffer from severe accuracy degradation because they neglect the choice of a reasonable sparsity rate for each layer. Previous methods for finding sparsity rates mainly target the training-aware scenario and usually fail to converge stably under the PTS setting, which offers limited data and a much smaller training budget. In this paper, we propose a fast and controllable post-training sparsity (FCPTS) framework. By incorporating a differentiable bridge function and a controllable optimization objective, our method learns an accurate sparsity allocation within minutes while guaranteeing convergence to a predetermined global sparsity rate. Equipped with these techniques, we surpass state-of-the-art methods by a large margin, e.g., an improvement of over 30\% for ResNet-50 on ImageNet at a sparsity rate of 80\%. Our plug-and-play code and supplementary materials are open-sourced at https://github.com/ModelTC/FCPTS.