Deep learning algorithms are increasingly deployed at the edge. Edge devices, however, are resource-constrained and therefore require efficient deployment of deep neural networks. Pruning is a key tool for edge deployment because it can reduce storage, compute, memory-bandwidth, and energy requirements. In this paper we propose a novel, accurate pruning technique that allows precise control over the size of the output network. Our method uses an efficient optimal transport scheme, which we make end-to-end differentiable and which automatically tunes the exploration-exploitation behavior of the algorithm to find accurate sparse sub-networks. We show that our method achieves state-of-the-art performance compared to previous pruning methods on three datasets, using five models, across a wide range of pruning ratios, and with two types of sparsity budgets and pruning granularities.
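
To make the idea of budget-controlled pruning through optimal transport more concrete, the following is a minimal sketch under stated assumptions, not the formulation used in this paper. It relaxes "keep exactly k weights" into an entropy-regularized transport problem between n weights and two bins ("keep" and "prune"), solved with Sinkhorn iterations. The function name `soft_topk_mask`, the squared-distance cost, and the parameters `eps` and `n_iters` are all hypothetical choices made for illustration. Every step is a smooth arithmetic operation, so the same computation could in principle be expressed in an automatic-differentiation framework.

```python
import math


def soft_topk_mask(scores, k, eps=0.05, n_iters=200):
    """Relaxed keep-mask whose entries sum to k (hypothetical helper).

    Assumption: importance scores are rescaled to [0, 1] and the transport
    cost is the squared distance to 1 ("keep") or to 0 ("prune").
    """
    n = len(scores)
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all scores are equal
    s = [(x - lo) / span for x in scores]

    # Cost of sending weight i to the "keep" bin (column 0) or "prune" bin (column 1).
    cost = [((1.0 - x) ** 2, x ** 2) for x in s]
    kernel = [(math.exp(-c0 / eps), math.exp(-c1 / eps)) for c0, c1 in cost]

    a = 1.0 / n               # every weight carries the same mass
    b = (k / n, (n - k) / n)  # bin capacities encode the size budget
    v = [1.0, 1.0]
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a / (k0 * v[0] + k1 * v[1]) for k0, k1 in kernel]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(2)]

    # Mass routed to the "keep" bin, rescaled so a fully kept weight is close to 1.
    return [n * u[i] * kernel[i][0] * v[0] for i in range(n)]


scores = [0.9, 0.1, 0.5, 0.8, 0.2, 0.05]
mask = soft_topk_mask(scores, k=2)
print([round(m, 3) for m in mask], round(sum(mask), 6))
```

Because the loop ends on the column update, the "keep" column of the transport plan carries mass k/n, so the mask sums to k up to floating-point error regardless of how far the row marginals have converged. In this sketch, `eps` acts as a temperature: larger values spread the mask over more weights, while smaller values push it toward a hard top-k selection. Here `eps` is fixed by hand, whereas the abstract describes this kind of trade-off as being tuned automatically.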