Neural networks have achieved remarkable performance in various application domains. Nevertheless, the large number of weights in pre-trained deep neural networks prevents them from being deployed on smartphones and embedded systems. It is highly desirable to obtain lightweight versions of neural networks for inference on edge devices. Many cost-effective approaches have been proposed to prune dense and convolutional layers, which are common in deep neural networks and dominate the parameter space. However, a unified theoretical foundation for the problem is largely missing. In this paper, we identify the close connection between matrix spectrum learning and neural network training for dense and convolutional layers, and argue that weight pruning is essentially a matrix sparsification process that preserves the spectrum. Based on this analysis, we also propose a matrix sparsification algorithm tailored for neural network pruning that yields better pruning results. We carefully design and conduct experiments to support our arguments. We thereby provide a consolidated viewpoint for neural network pruning and enhance the interpretability of deep neural networks by identifying and preserving the critical neural weights.
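To make the idea of pruning as spectrum-preserving sparsification concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes plain magnitude-based pruning as a stand-in sparsifier, a random Gaussian matrix as a stand-in for a dense layer's weights, and the relative spectral-norm error as the measure of how well the spectrum is preserved. The helper names `magnitude_prune` and `spectral_error` are hypothetical.

```python
import numpy as np


def magnitude_prune(weights, keep_fraction):
    """Keep only the largest-magnitude entries; zero out the rest.

    Illustrative stand-in for a sparsifier, not the paper's algorithm.
    """
    flat = np.abs(weights).ravel()
    k = max(1, int(round(keep_fraction * flat.size)))
    # The k-th largest magnitude serves as the keep threshold.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return np.where(np.abs(weights) >= threshold, weights, 0.0)


def spectral_error(original, pruned):
    """Relative spectral-norm error ||A - B||_2 / ||A||_2."""
    diff_norm = np.linalg.norm(original - pruned, ord=2)
    return diff_norm / np.linalg.norm(original, ord=2)


# Assumption: a random matrix stands in for a trained dense layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

W_pruned = magnitude_prune(W, keep_fraction=0.3)
sparsity = 1.0 - np.count_nonzero(W_pruned) / W_pruned.size
err = spectral_error(W, W_pruned)

# Singular values before and after pruning, for side-by-side inspection.
sv_before = np.linalg.svd(W, compute_uv=False)
sv_after = np.linalg.svd(W_pruned, compute_uv=False)

print(f"sparsity: {sparsity:.2f}, relative spectral error: {err:.3f}")
```

Under this viewpoint, a better pruning method would be one that reaches the same sparsity with a smaller spectral error. Trained weight matrices typically have more structure than the random matrix assumed here, so the numbers from this sketch only illustrate the measurement, not the behavior of real networks.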