Data uncertainties, such as sensor noise or occlusions, can introduce irreducible ambiguities in images, which result in varying, yet plausible, semantic hypotheses. In machine learning, this ambiguity is commonly referred to as aleatoric uncertainty. Latent density models can be used to address this problem in image segmentation. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the Evidence Lower Bound on the conditional data log-likelihood. In this work, we demonstrate that the PU-Net latent space is severely inhomogeneous. As a result, the effectiveness of gradient descent is inhibited and the model becomes extremely sensitive to the location of the latent space samples, resulting in defective predictions. To address this, we present the Sinkhorn PU-Net (SPU-Net), which uses the Sinkhorn Divergence to promote homogeneity across all latent dimensions, effectively improving gradient-descent updates and model robustness. Our results show that, on public datasets covering various clinical segmentation problems, the SPU-Net achieves performance gains of up to 11% on the Hungarian-Matched metric compared with preceding latent variable models for probabilistic segmentation. The results indicate that encouraging a homogeneous latent space can significantly improve latent density modeling for medical image segmentation.
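To make the Sinkhorn Divergence mentioned above more concrete, the following is a minimal sketch of its standard debiased form, OT(x, y) - 0.5 OT(x, x) - 0.5 OT(y, y), computed between two sets of samples with log-domain Sinkhorn iterations. It is an illustration only and is not taken from the SPU-Net implementation. The abstract does not state which distributions are compared or how the term enters the training objective, so comparing latent samples against samples from a reference Gaussian is an assumption, as are the half squared Euclidean cost, the uniform sample weights, the regularization strength `eps`, the iteration count, and all function names.

```python
import math
import random


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def _cost(x, y):
    """Half squared Euclidean distance between two points (assumed cost)."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, y))


def entropic_ot(xs, ys, eps=1.0, n_iters=100):
    """Dual estimate of entropic OT between two uniform empirical measures."""
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)  # uniform weights (assumption)
    C = [[_cost(x, y) for y in ys] for x in xs]
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(n_iters):
        # Log-domain Sinkhorn updates of the dual potentials f and g.
        f = [-eps * _logsumexp([log_b + (g[j] - C[i][j]) / eps for j in range(m)])
             for i in range(n)]
        g = [-eps * _logsumexp([log_a + (f[i] - C[i][j]) / eps for i in range(n)])
             for j in range(m)]
    # Dual objective evaluated at the current potentials.
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=1.0, n_iters=100):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (entropic_ot(xs, ys, eps, n_iters)
            - 0.5 * entropic_ot(xs, xs, eps, n_iters)
            - 0.5 * entropic_ot(ys, ys, eps, n_iters))


def sample_gaussian(n, dim, mean, rng):
    """Draw n points from an isotropic unit-variance Gaussian (hypothetical helper)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    latents = sample_gaussian(20, 2, 0.0, rng)    # stand-in for latent samples
    reference = sample_gaussian(20, 2, 0.0, rng)  # stand-in for a reference density
    shifted = sample_gaussian(20, 2, 3.0, rng)    # deliberately mismatched samples
    print("matched:", sinkhorn_divergence(latents, reference))
    print("shifted:", sinkhorn_divergence(latents, shifted))
```

In this sketch the divergence should be small when the two sample sets come from the same distribution and grow as they drift apart, which is the property that makes such a term usable as a regularizer on a latent space. A practical implementation would use a vectorized, differentiable library rather than pure Python loops.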