In this work, we propose a novel prior learning method for improving generalization and uncertainty estimation in deep neural networks. The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees. Our learned priors provide expressive probabilistic representations at large scale, acting as Bayesian counterparts of models pre-trained on ImageNet, and further produce non-vacuous generalization bounds. We also extend this idea to a continual learning framework, where these favorable properties of our priors are desirable. Two technical contributions make this possible: (1) sums-of-Kronecker-product computations, and (2) the derivation and optimization of tractable objectives that lead to improved generalization bounds. Empirically, we extensively demonstrate the effectiveness of this method for uncertainty estimation and generalization.
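To give a rough sense of why sums of Kronecker products are attractive at scale, the sketch below applies a matrix of the form $\sum_k A_k \otimes B_k$ to a vector without ever forming the full matrix. It illustrates a generic linear-algebra identity only and is not the paper's actual algorithm. The function name `sum_kron_matvec`, the factor shapes, and the use of NumPy are assumptions made for this example.

```python
import numpy as np

def sum_kron_matvec(factors, v):
    """Compute (sum_k A_k ⊗ B_k) @ v without forming any Kronecker product.

    factors: list of (A_k, B_k) pairs; every A_k is (m, m), every B_k is (n, n).
    v: vector of length m * n.
    Hypothetical helper for illustration; not taken from the paper.
    """
    m = factors[0][0].shape[0]
    n = factors[0][1].shape[0]
    # Interpret v as a row-major flattening of an (m, n) matrix X.
    X = v.reshape(m, n)
    out = np.zeros((m, n))
    for A, B in factors:
        # Row-major identity: (A ⊗ B) vec(X) = vec(A X B^T).
        out += A @ X @ B.T
    return out.reshape(-1)
```

For square factors of sizes m and n, each term costs on the order of m²n + mn² operations, whereas a dense (mn × mn) matrix-vector product costs m²n². Savings of this kind are what make Kronecker-structured covariances feasible for large layers.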