Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation

Deep learning (DL)-based methods have achieved state-of-the-art performance for many medical image segmentation tasks. Nevertheless, recent studies show that deep neural networks (DNNs) can be miscalibrated and overconfident, leading to "silent failures" that are risky for clinical applications. Bayesian DL provides an intuitive approach to DL failure detection, based on posterior probability estimation. However, the posterior is intractable for large medical image segmentation DNNs. To tackle this challenge, we propose a Bayesian learning framework, named HMC-CP, that uses Hamiltonian Monte Carlo (HMC) tempered by a cold posterior (CP) to accommodate medical data augmentation. For HMC computation, we further propose a cyclical annealing strategy that captures both local and global geometries of the posterior distribution and enables highly efficient Bayesian DNN training with the same computational budget as training a single DNN. The resulting Bayesian DNN outputs an ensemble segmentation along with the segmentation uncertainty. We evaluate the proposed HMC-CP extensively on cardiac magnetic resonance imaging (MRI) segmentation, using in-domain steady-state free precession (SSFP) cine images as well as out-of-domain datasets of quantitative T1 and T2 mapping. Our results show that the proposed method improves both segmentation accuracy and uncertainty estimation for in- and out-of-domain data, compared with well-established baseline methods such as Monte Carlo Dropout and Deep Ensembles. Additionally, we establish a conceptual link between HMC and the widely used stochastic gradient descent (SGD) and provide general insight into the uncertainty of DL, which is implicitly encoded in the training dynamics but often overlooked. With reliable uncertainty estimation, our method provides a promising direction toward trustworthy DL in clinical applications.

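To make the sampling idea concrete, the following is a minimal sketch of HMC with a tempered ("cold") target and a cyclical step-size schedule, run on a one-dimensional toy posterior. It is not the authors' implementation. The abstract does not specify the schedule or the sampler details, so the cosine schedule, the full-gradient leapfrog integrator with a Metropolis correction, the choice to keep samples only from the second half of each cycle, and all names (`cyclical_step_size`, `hmc_step`, `run_toy_sampler`) and hyperparameter values are assumptions made for illustration.

```python
import math
import random


def cyclical_step_size(step, steps_per_cycle, max_step, min_step):
    """Cosine schedule (assumed form) that restarts at max_step every cycle."""
    position = (step % steps_per_cycle) / steps_per_cycle  # in [0, 1)
    return min_step + 0.5 * (max_step - min_step) * (1.0 + math.cos(math.pi * position))


def hmc_step(theta, log_post, grad_log_post, step_size, n_leapfrog, temperature, rng):
    """One HMC transition targeting the tempered density exp(log_post / T)."""

    def potential(x):
        return -log_post(x) / temperature

    def grad_potential(x):
        return -grad_log_post(x) / temperature

    momentum = rng.gauss(0.0, 1.0)  # fresh momentum, unit mass
    x, p = theta, momentum

    # Leapfrog: half momentum step, alternating full steps, half momentum step.
    p -= 0.5 * step_size * grad_potential(x)
    for i in range(n_leapfrog):
        x += step_size * p
        if i < n_leapfrog - 1:
            p -= step_size * grad_potential(x)
    p -= 0.5 * step_size * grad_potential(x)

    # Metropolis correction based on the change in total energy.
    h_old = potential(theta) + 0.5 * momentum ** 2
    h_new = potential(x) + 0.5 * p ** 2
    accept = rng.random() < math.exp(min(0.0, h_old - h_new))
    return x if accept else theta


def run_toy_sampler(temperature, seed=0, n_cycles=50, steps_per_cycle=100):
    """Sample a toy standard-normal 'posterior' tempered by 1 / temperature."""
    rng = random.Random(seed)
    log_post = lambda x: -0.5 * x * x   # toy log-posterior (assumption)
    grad_log_post = lambda x: -x
    theta, samples = 0.0, []
    for step in range(n_cycles * steps_per_cycle):
        eps = cyclical_step_size(step, steps_per_cycle, max_step=0.5, min_step=0.05)
        theta = hmc_step(theta, log_post, grad_log_post, eps, 10, temperature, rng)
        # Large steps early in a cycle explore; keep samples from the
        # small-step second half only (an illustrative choice).
        if step % steps_per_cycle >= steps_per_cycle // 2:
            samples.append(theta)
    return samples


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    print("T = 1.0 sample variance:", variance(run_toy_sampler(1.0)))
    print("T = 0.5 sample variance:", variance(run_toy_sampler(0.5)))
```

For this toy target, the tempered density at temperature T is a Gaussian with variance T, so the samples drawn at T = 0.5 should be more tightly concentrated than those drawn at T = 1. In a segmentation setting, each retained sample would instead be a set of network weights, and the ensemble prediction would average the outputs of those networks.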