Achieving robust uncertainty quantification for deep neural networks is an important requirement in many real-world applications of deep learning, such as medical imaging, where it is necessary to assess the reliability of a neural network's predictions. Bayesian neural networks are a promising approach for modeling uncertainties in deep neural networks. Unfortunately, generating samples from the posterior distribution of neural networks is a major challenge. One significant advance in that direction would be the incorporation of adaptive step sizes, similar to those in modern neural network optimizers, into Markov chain Monte Carlo sampling algorithms without significantly increasing computational demand. Over the past years, several papers have introduced sampling algorithms with claims that they achieve this property. However, do they indeed converge to the correct distribution? In this paper, we demonstrate that these methods can have a substantial bias in the distribution they sample, even in the limit of vanishing step sizes and at full batch size.