Bayesian neural networks (BNNs) are a principled approach to modeling predictive uncertainties in deep learning, which are important in safety-critical applications. Since exact Bayesian inference over the weights in a BNN is intractable, various approximate inference methods exist, among which sampling methods such as Hamiltonian Monte Carlo (HMC) are often considered the gold standard. While HMC provides high-quality samples, it lacks interpretable summary statistics because its sample mean and variance are meaningless in neural networks due to permutation symmetry. In this paper, we first show that the role of permutations can be meaningfully quantified by a number-of-transpositions metric. We then show that the recently proposed rebasin method allows us to summarize HMC samples into a compact representation that provides a meaningful explicit uncertainty estimate for each weight in a neural network, thus unifying sampling methods with variational inference. We show that this compact representation allows us to compare trained BNNs directly in weight space across sampling methods and variational inference, and to efficiently prune neural networks trained without explicit Bayesian frameworks by exploiting uncertainty estimates from HMC.
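The permutation symmetry mentioned above can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the helper names (`mlp`, `permute_hidden`, `num_transpositions`), the one-hidden-layer architecture, and the random weights are all assumptions made for illustration. The transposition count here is the standard group-theoretic quantity (permutation length minus the number of cycles), which may differ from the exact metric the paper defines.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer network with tanh activation (illustrative only)."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

def permute_hidden(W1, b1, W2, perm):
    """Reorder the hidden units; the function the network computes is unchanged."""
    return W1[:, perm], b1[perm], W2[perm, :]

def num_transpositions(perm):
    """Minimum number of swaps needed to sort `perm`: n minus the number of cycles."""
    perm = np.asarray(perm)
    seen = np.zeros(len(perm), dtype=bool)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return len(perm) - cycles

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)

# Swap hidden units 0<->1 and 2<->3.
perm = np.array([1, 0, 3, 2])
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

# The permuted weights define exactly the same function.
same_function = np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))

# Naive weight-space mean of the two equivalent networks.
W1m, b1m, W2m = (W1 + W1p) / 2, (b1 + b1p) / 2, (W2 + W2p) / 2
mean_matches = np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1m, b1m, W2m, b2))
```

In this sketch the two weight settings are functionally identical, yet their element-wise mean generally defines a different function, which is why unaligned sample means and variances are hard to interpret. Aligning samples to a common permutation before summarizing them is the role the abstract assigns to the rebasin method.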