In recent years, deep network pruning has attracted significant attention as a way to deploy AI rapidly on small devices with computation and memory constraints. Pruning is often achieved by dropping redundant weights, neurons, or layers of a deep network while attempting to retain comparable test performance. Many deep pruning algorithms have been proposed with impressive empirical success. However, existing approaches lack a quantifiable measure to estimate the compressibility of a sub-network during each pruning iteration and thus may under-prune or over-prune the model. In this work, we propose the PQ Index (PQI) to measure the potential compressibility of deep neural networks and use it to develop a Sparsity-informed Adaptive Pruning (SAP) algorithm. Our extensive experiments corroborate the hypothesis that, for a generic pruning procedure, PQI first decreases while a large model is being effectively regularized and then increases when its compressibility reaches a limit that appears to correspond to the beginning of underfitting. Subsequently, PQI decreases again when the model collapses and its performance starts to deteriorate significantly. Additionally, our experiments demonstrate that the proposed adaptive pruning algorithm, with a proper choice of hyper-parameters, is superior to iterative pruning algorithms such as lottery ticket-based pruning methods in terms of both compression efficiency and robustness.
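The abstract does not state how PQI is computed. As a rough illustration only, the sketch below assumes a norm-ratio form, 1 - d^(1/q - 1/p) * ||w||_p / ||w||_q with 0 < p < q, applied to a flat list of d weights. The function names `lp_norm` and `pq_index`, the defaults p = 1 and q = 2, and the formula itself are assumptions introduced here, not details confirmed by the text above.

```python
def lp_norm(weights, p):
    """Return (sum |w_i|^p)^(1/p); for p < 1 this is only a quasi-norm."""
    return sum(abs(w) ** p for w in weights) ** (1.0 / p)


def pq_index(weights, p=1.0, q=2.0):
    """Assumed norm-ratio sparsity measure (hypothetical form, see lead-in).

    Computes 1 - d^(1/q - 1/p) * ||w||_p / ||w||_q for a flat weight list.
    """
    if not 0 < p < q:
        raise ValueError("requires 0 < p < q")
    d = len(weights)
    if d == 0:
        raise ValueError("weights must be non-empty")
    norm_q = lp_norm(weights, q)
    if norm_q == 0.0:
        raise ValueError("weights must not all be zero")
    return 1.0 - d ** (1.0 / q - 1.0 / p) * lp_norm(weights, p) / norm_q


# Equal-magnitude weights: no entry is more expendable than another.
dense = [1.0, -1.0, 1.0, -1.0]
# A single non-zero weight: the other entries could be dropped for free.
sparse = [3.0, 0.0, 0.0, 0.0]

print("dense :", pq_index(dense))
print("sparse:", pq_index(sparse))
```

Under this assumed form, a vector whose entries all have the same magnitude scores 0, and the score grows as the mass concentrates on fewer entries, so a larger value suggests more room for pruning. An adaptive pruner could, in principle, recompute such a score after each pruning round and adjust the pruning ratio accordingly. The abstract does not specify how SAP does this.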