Deep learning algorithms are increasingly employed at the edge. However, edge devices are resource-constrained and thus require efficient deployment of deep neural networks. Pruning methods are a key tool for edge deployment, as they can reduce storage, compute, memory-bandwidth, and energy requirements. In this paper we propose a novel, accurate pruning technique that allows precise control over the size of the output network. Our method uses an efficient optimal transport scheme, which we make end-to-end differentiable and which automatically tunes the exploration-exploitation behavior of the algorithm to find accurate sparse sub-networks. We show that our method achieves state-of-the-art performance compared to previous pruning methods on three datasets, using five models, across a wide range of pruning ratios, and with two types of sparsity budgets and two pruning granularities.
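To make the idea of a budget-controlled, optimal-transport-based mask more concrete, here is a minimal NumPy sketch. It is not the method proposed in the paper, whose details are not given in this abstract. It only illustrates one generic way to obtain a soft keep/prune mask whose entries sum to a prescribed budget `k`: entropy-regularized optimal transport solved with Sinkhorn iterations. The function name `soft_topk_mask`, the linear cost, and the parameters `eps` and `n_iters` are assumptions made for illustration. Here `eps` is a fixed constant, whereas the abstract describes the corresponding trade-off as tuned automatically.

```python
import numpy as np

def soft_topk_mask(scores, k, eps=0.05, n_iters=200):
    """Soft mask in [0, 1] whose entries sum to k (hypothetical helper).

    Each of the n weights carries mass 1/n and is transported to one of two
    bins, "keep" (total mass k/n) or "prune" (total mass (n-k)/n), using
    entropy-regularized optimal transport solved by Sinkhorn iterations.
    """
    s = np.asarray(scores, dtype=np.float64)
    n = s.size
    # Normalize scores to [0, 1] so that exp(-C / eps) stays well-scaled.
    s = (s - s.min()) / (s.max() - s.min() + 1e-12)
    # Assumed cost: sending a high-score weight to "keep" is cheap.
    C = np.stack([-s, np.zeros(n)], axis=1)      # shape (n, 2)
    a = np.full(n, 1.0 / n)                      # source marginal
    b = np.array([k / n, (n - k) / n])           # target marginal (the budget)
    K = np.exp(-C / eps)                         # Gibbs kernel
    u, v = np.ones(n), np.ones(2)
    for _ in range(n_iters):
        u = a / (K @ v)                          # match row marginals
        v = b / (K.T @ u)                        # match column marginals
    P = u[:, None] * K * v[None, :]              # transport plan
    # Mass sent to "keep", rescaled so each entry lies in [0, 1].
    return n * P[:, 0]

scores = np.array([0.1, 2.0, -1.0, 0.7, 1.5, 0.3])  # toy importance scores
mask = soft_topk_mask(scores, k=2)
print(np.round(mask, 3), mask.sum())
```

Because the last Sinkhorn update enforces the column marginals, the mask sums to `k` up to floating-point error, which is how the budget is controlled in this sketch. A smaller `eps` pushes the mask toward a hard top-`k` selection, while a larger `eps` keeps it softer so that more weights receive a non-zero share. Every step is a smooth array operation, so the same computation written in an automatic-differentiation framework would be differentiable with respect to the scores.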