Data uncertainties, such as sensor noise or occlusions, can introduce irreducible ambiguities in images, which result in varying, yet plausible, semantic hypotheses. In machine learning, this ambiguity is commonly referred to as aleatoric uncertainty. Latent density models can be used to address this problem in image segmentation. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the Evidence Lower Bound on the conditional data log-likelihood. In this work, we demonstrate that the PU-Net latent space is severely inhomogeneous. As a result, the effectiveness of gradient descent is inhibited and the model becomes extremely sensitive to the location of the latent space samples, resulting in defective predictions. To address this, we present the Sinkhorn PU-Net (SPU-Net), which uses the Sinkhorn Divergence to promote homogeneity across all latent dimensions, effectively improving gradient-descent updates and model robustness. On public datasets covering various clinical segmentation problems, the SPU-Net achieves performance gains of up to 11% on the Hungarian-Matched metric over preceding latent variable models for probabilistic segmentation. The results indicate that encouraging a homogeneous latent space can significantly improve latent density modeling for medical image segmentation.
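For readers unfamiliar with the Sinkhorn Divergence, a minimal, dependency-free sketch of the quantity is given here. It is not the SPU-Net implementation. The function names, the half squared Euclidean ground cost, the regularization strength `eps`, the iteration count, and the toy Gaussian point clouds standing in for latent samples are all assumptions made for illustration. The sketch computes the debiased form OT(x, y) - 0.5 OT(x, x) - 0.5 OT(y, y), where each entropy-regularized OT term is estimated with log-domain Sinkhorn iterations.

```python
import math
import random


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def entropic_ot(xs, ys, eps=0.5, n_iters=100):
    """Entropy-regularized OT cost between two uniformly weighted point clouds.

    Runs log-domain Sinkhorn iterations and returns the dual objective.
    """
    n, m = len(xs), len(ys)
    # Ground cost: half squared Euclidean distance (an illustrative assumption).
    cost = [[0.5 * sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]
    log_a, log_b = -math.log(n), -math.log(m)  # uniform log-weights
    f = [0.0] * n  # dual potential on xs
    g = [0.0] * m  # dual potential on ys
    for _ in range(n_iters):
        f = [-eps * _logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
             for i in range(n)]
        g = [-eps * _logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
             for j in range(m)]
    # After the g-update the implied coupling has total mass 1, so the dual
    # objective reduces to the weighted means of the two potentials.
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=0.5, n_iters=100):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    cross = entropic_ot(xs, ys, eps, n_iters)
    self_x = entropic_ot(xs, xs, eps, n_iters)
    self_y = entropic_ot(ys, ys, eps, n_iters)
    return cross - 0.5 * self_x - 0.5 * self_y


def sample_gaussian(n, mean, rng):
    """Draw n two-dimensional points from a unit-variance isotropic Gaussian."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


# Toy stand-ins for latent samples (hypothetical, not data from the paper).
rng = random.Random(0)
latent_samples = sample_gaussian(20, 0.0, rng)
near_reference = sample_gaussian(20, 0.0, rng)
far_reference = sample_gaussian(20, 3.0, rng)

d_near = sinkhorn_divergence(latent_samples, near_reference)
d_far = sinkhorn_divergence(latent_samples, far_reference)
print(f"near: {d_near:.3f}, far: {d_far:.3f}")
```

In this toy setting the divergence to the shifted cloud should come out clearly larger than the divergence to the cloud drawn from the same Gaussian, and the divergence of a cloud to itself is zero by construction. A practical training setup would more likely rely on a differentiable, batched implementation in an autodiff framework; the pure-Python loops here are chosen only to keep the sketch self-contained.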