Robust uncertainty quantification for deep neural networks is an important requirement in many real-world applications of deep learning, such as medical imaging, where the reliability of a network's prediction must be assessed. Bayesian neural networks are a promising approach for modeling uncertainty in deep neural networks. Unfortunately, generating samples from the posterior distribution of a neural network remains a major challenge. One significant advance in that direction would be to incorporate adaptive step sizes, similar to those of modern neural network optimizers, into Markov chain Monte Carlo sampling algorithms without significantly increasing computational cost. In recent years, several papers have introduced sampling algorithms that claim to achieve this property. But do they actually converge to the correct distribution? In this paper, we demonstrate that these methods can have a substantial bias in the distribution they sample, even in the limit of vanishing step sizes and at full batch size.
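To make the kind of bias discussed above concrete, the following is a minimal toy sketch; it is not one of the algorithms examined in this paper. It assumes a one-dimensional standard normal target with potential U(x) = x²/2 and a hypothetical state-dependent step-size factor a(x) = 1/(1 + x²), and simulates Langevin dynamics with drift -a(x)U'(x) and noise scale sqrt(2a(x)). For this Itô diffusion the stationary density is proportional to exp(-U(x))/a(x) rather than exp(-U(x)), so for these assumed choices the stationary variance is 2 instead of 1, however small the discretization step h is. Adding the derivative a'(x) to the drift restores the correct target. The function name `simulate` and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate(correct, n_chains=4000, n_steps=5000, h=0.02, seed=0):
    """Euler-Maruyama simulation of Langevin dynamics with a
    state-dependent step-size factor a(x) = 1 / (1 + x^2).

    Target (assumed for illustration): standard normal, U(x) = x^2 / 2.
    If `correct` is True, the drift includes the derivative a'(x),
    which makes exp(-U) the stationary density again.
    Returns the empirical variance across chains after the last step.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_chains)  # start all chains from N(0, 1)
    for _ in range(n_steps):
        a = 1.0 / (1.0 + x**2)        # adaptive step-size factor a(x)
        drift = -a * x                # -a(x) * U'(x), with U'(x) = x
        if correct:
            # correction term a'(x) = -2x / (1 + x^2)^2
            drift = drift - 2.0 * x / (1.0 + x**2) ** 2
        noise = rng.standard_normal(n_chains)
        x = x + h * drift + np.sqrt(2.0 * h * a) * noise
    return float(np.var(x))


# Without the correction the variance should land near 2 (biased);
# with the correction it should land near the target variance of 1.
print("variance without correction:", simulate(correct=False))
print("variance with correction:   ", simulate(correct=True))
```

Reducing `h` further should not remove the gap between the two variances in this toy setting, because the discrepancy comes from the continuous-time dynamics rather than from discretization error.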