Bayesian neural networks often approximate the weight posterior with a Gaussian distribution. However, practical posteriors are often highly non-Gaussian, even locally, and empirical performance deteriorates as a result. We propose a simple parametric approximate posterior that adapts to the shape of the true posterior through a Riemannian metric determined by the log-posterior gradient. We develop a Riemannian Laplace approximation in which samples naturally fall into weight regions with low negative log-posterior. We show that these samples can be drawn by solving a system of ordinary differential equations, which can be done efficiently by leveraging the structure of the Riemannian metric and automatic differentiation. Empirically, we demonstrate that our approach consistently improves over the conventional Laplace approximation across tasks. We further show that, unlike the conventional Laplace approximation, our method is not overly sensitive to the choice of prior, which alleviates a practical pitfall of current approaches.
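To make the sampling procedure concrete, the following is a minimal sketch on a two-dimensional toy problem, not the authors' implementation. It rests on several assumptions that the abstract does not spell out: the metric is taken to be M(θ) = I + ∇L(θ)∇L(θ)ᵀ, with L the negative log-posterior, for which the geodesic equation reduces to c'' = -∇L(c) (c'ᵀ H(c) c') / (1 + ‖∇L(c)‖²); the banana-shaped `neg_log_post`, its hand-written derivatives, and the fixed-step RK4 integrator are all hypothetical stand-ins for a real network loss, automatic differentiation, and an adaptive ODE solver.

```python
import random

def neg_log_post(theta):
    # Hypothetical banana-shaped negative log-posterior with its minimum (MAP) at (0, 0).
    x, y = theta
    return 0.5 * x * x + 0.5 * (y - x * x) ** 2

def grad(theta):
    # Hand-written gradient of neg_log_post (autodiff would supply this in practice).
    x, y = theta
    r = y - x * x
    return (x - 2.0 * x * r, r)

def hess_quad(theta, v):
    # Quadratic form v^T H(theta) v with the analytic Hessian of neg_log_post.
    x, y = theta
    hxx = 1.0 - 2.0 * y + 6.0 * x * x
    hxy = -2.0 * x
    hyy = 1.0
    return hxx * v[0] * v[0] + 2.0 * hxy * v[0] * v[1] + hyy * v[1] * v[1]

def geodesic_rhs(state):
    # First-order form of the geodesic ODE under the assumed metric I + g g^T:
    #   c' = v,   v' = -g * (v^T H v) / (1 + |g|^2)
    cx, cy, vx, vy = state
    gx, gy = grad((cx, cy))
    scale = -hess_quad((cx, cy), (vx, vy)) / (1.0 + gx * gx + gy * gy)
    return (vx, vy, scale * gx, scale * gy)

def exp_map(theta_map, v, n_steps=500):
    # Integrate the geodesic from t = 0 to t = 1 with classical RK4.
    state = (theta_map[0], theta_map[1], v[0], v[1])
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = geodesic_rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state[:2], state[2:]  # end point and final velocity

def sample_riemannian_laplace(theta_map, rng, n_steps=500):
    # The toy Hessian at the MAP is the identity, so the tangent-space
    # Gaussian of the ordinary Laplace approximation is N(0, I).
    v = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    theta, _ = exp_map(theta_map, v, n_steps)
    return theta, v

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        theta, v = sample_riemannian_laplace((0.0, 0.0), rng)
        linear = (v[0], v[1])  # what the conventional Laplace sample would be
        print(theta, neg_log_post(theta), neg_log_post(linear))
```

Under this assumed metric, a geodesic is a constant-speed curve on the graph of L, so the rise in L along it cannot exceed the norm of the initial velocity. This is one way to see why such samples stay in low-loss regions, whereas the straight-line sample θ_MAP + v has no such bound.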