Neural network pruning has become increasingly crucial due to the complexity of neural network models and their widespread use in various fields. Existing pruning algorithms often suffer from limitations such as architecture specificity, excessive complexity, and reliance on complex calculations, which make them impractical for real-world applications. In this paper, we propose KEN: a straightforward, universal, and unstructured pruning algorithm based on Kernel Density Estimation (KDE). KEN aims to construct optimized transformer models by selectively preserving the most significant parameters while restoring the others to their pre-training state. This approach maintains model performance while allowing storage of only the optimized subnetwork, leading to significant memory savings. Extensive evaluations on seven transformer models demonstrate that KEN achieves equal or better performance than the original models with a parameter reduction of at least 25%. In-depth comparisons against other pruning and PEFT algorithms confirm KEN's effectiveness. Furthermore, we introduce KEN_viz, an explainability tool that visualizes the composition of the optimized model and the subnetwork selected by KEN.
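The selection step described in the abstract (keep the most significant fine-tuned parameters, restore the rest to their pre-trained values) can be illustrated with a minimal sketch. The abstract does not specify how KDE is applied, so everything below is an assumption made for illustration: the function name `ken_like_select`, the row-by-row processing, the parameter `k`, and the rule that "most significant" means the `k` entries with the highest estimated density within each row. It uses `scipy.stats.gaussian_kde` and should not be read as the authors' implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde


def ken_like_select(pretrained, finetuned, k):
    """Keep k fine-tuned values per row, chosen by KDE density; reset the rest.

    Assumptions (not stated in the abstract): selection runs row by row, and
    the k entries whose fine-tuned values have the highest estimated density
    within their row are the ones preserved.
    Returns the merged matrix and a boolean mask of the preserved entries.
    """
    if pretrained.shape != finetuned.shape or finetuned.ndim != 2:
        raise ValueError("expected two 2-D arrays of equal shape")
    if not 1 <= k <= finetuned.shape[1]:
        raise ValueError("k must be between 1 and the row length")

    merged = pretrained.copy()                 # start from the pre-trained weights
    mask = np.zeros(finetuned.shape, dtype=bool)
    for i, row in enumerate(finetuned):
        # Fit a 1-D Gaussian KDE on the row and evaluate it at every entry.
        # gaussian_kde fails on a constant row (zero variance).
        density = gaussian_kde(row)(row)
        keep = np.argsort(-density)[:k]        # indices of the k highest densities
        mask[i, keep] = True
        merged[i, keep] = row[keep]            # preserve the fine-tuned values
    return merged, mask


# Toy usage with random weights standing in for a real layer.
rng = np.random.default_rng(0)
w_pre = rng.normal(size=(4, 16))
w_fine = w_pre + rng.normal(scale=0.1, size=(4, 16))
w_opt, kept = ken_like_select(w_pre, w_fine, k=4)
print(kept.sum(axis=1))                        # number of preserved entries per row
```

Under these assumptions, only the positions in `kept` and the corresponding values of `w_opt` need to be stored alongside a reference to the pre-trained checkpoint, which is the source of the memory savings the abstract mentions.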