Network complexity and computational efficiency have become increasingly significant aspects of deep learning. Sparse deep learning addresses these challenges by pruning heavily over-parameterized deep neural networks to recover a sparse representation of the underlying target function. Specifically, deep neural architectures compressed via structured sparsity (e.g., node sparsity) provide low-latency inference, higher data throughput, and reduced energy consumption. In this paper, we explore two well-established shrinkage techniques, Lasso and Horseshoe, for model compression in Bayesian neural networks. To this end, we propose structurally sparse Bayesian neural networks that systematically prune redundant nodes with (i) Spike-and-Slab Group Lasso (SS-GL) and (ii) Spike-and-Slab Group Horseshoe (SS-GHS) priors, and we develop computationally tractable variational inference that includes a continuous relaxation of the Bernoulli variables. We establish the contraction rates of the variational posterior of our proposed models as a function of the network topology, layer-wise node cardinalities, and bounds on the network weights. We empirically demonstrate that our models are competitive with baseline models in prediction accuracy, model compression, and inference latency.
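The continuous relaxation of Bernoulli variables mentioned above can be illustrated with a small sketch. It assumes a binary Concrete (Gumbel-softmax style) relaxation, which is one common choice and not necessarily the exact form used in the paper. The function names `relaxed_bernoulli` and `gated_layer`, the temperature `tau`, and the inclusion probabilities `pi` are hypothetical and chosen for illustration only.

```python
import numpy as np

def relaxed_bernoulli(pi, tau, rng):
    """Draw continuous surrogates in [0, 1] for Bernoulli(pi) node gates.

    Assumption: binary Concrete relaxation, z = sigmoid((logit(pi) + logit(u)) / tau)
    with u ~ Uniform(0, 1). Smaller tau pushes samples toward 0 or 1.
    """
    pi = np.clip(pi, 1e-6, 1 - 1e-6)
    u = rng.uniform(1e-6, 1 - 1e-6, size=pi.shape)
    logits = np.log(pi) - np.log1p(-pi) + np.log(u) - np.log1p(-u)
    return 1.0 / (1.0 + np.exp(-logits / tau))

def gated_layer(x, W, b, z):
    """ReLU layer whose output nodes are scaled by one gate per node."""
    return np.maximum(x @ W + b, 0.0) * z

rng = np.random.default_rng(0)

# Hypothetical inclusion probabilities for three hidden nodes.
pi = np.array([0.95, 0.5, 0.05])
z = relaxed_bernoulli(pi, tau=0.5, rng=rng)

x = rng.normal(size=(2, 4))   # batch of 2 inputs with 4 features
W = rng.normal(size=(4, 3))   # weights into the 3 hidden nodes
b = np.zeros(3)
h = gated_layer(x, W, b, z)   # soft-gated activations, shape (2, 3)

# After training, nodes with low inclusion probability can be dropped
# outright, which shrinks the weight matrix (structured sparsity).
keep = pi > 0.5
W_pruned, b_pruned = W[:, keep], b[keep]
```

Because each gate acts on a whole node rather than on a single weight, removing a gated-off node deletes an entire column of the weight matrix, which is what yields the smaller dense layers and lower inference latency that structured sparsity aims for.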