Data uncertainties, such as sensor noise or occlusions, can introduce irreducible ambiguities in images, which result in varying, yet plausible, semantic hypotheses. In Machine Learning, this ambiguity is commonly referred to as aleatoric uncertainty. Latent density models can be used to address this problem in image segmentation. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the Evidence Lower Bound (ELBO) on the conditional data log-likelihood. In this work, we demonstrate that the PU-Net latent space is severely inhomogeneous. As a result, the effectiveness of gradient descent is inhibited and the model becomes extremely sensitive to the location of the latent-space samples, resulting in defective predictions. To address this, we present the Sinkhorn PU-Net (SPU-Net), which uses the Sinkhorn Divergence to promote homogeneity across all latent dimensions, effectively improving gradient-descent updates and model robustness. Applied to public datasets covering various clinical segmentation problems, the SPU-Net achieves performance gains of up to 11% on the Hungarian-Matched metric over previous latent variable models for probabilistic segmentation. The results indicate that encouraging a homogeneous latent space can significantly improve latent density modeling for medical image segmentation.
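To make the central quantity concrete, the following is a minimal NumPy sketch of a Sinkhorn divergence between two sets of samples, using the standard debiased form S_eps(a, b) = OT_eps(a, b) - 0.5 OT_eps(a, a) - 0.5 OT_eps(b, b). It is not the authors' implementation: the abstract does not state which distributions are compared or how the term enters the training objective, so the function names (`entropic_ot`, `sinkhorn_divergence`), the half-squared-Euclidean cost, the uniform sample weights, and the values of `eps` and `n_iters` are all assumptions chosen for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(a - m).sum(axis=axis))


def entropic_ot(x, y, eps=0.5, n_iters=200):
    """Dual estimate of entropy-regularized OT between two point clouds.

    x has shape (n, d) and y has shape (m, d). Both clouds carry uniform
    weights, and the cost is half the squared Euclidean distance
    (illustrative assumptions, not taken from the paper).
    """
    n, m = x.shape[0], y.shape[0]
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    log_a = -np.log(n)  # log of the uniform weight 1/n
    log_b = -np.log(m)  # log of the uniform weight 1/m
    f = np.zeros(n)     # dual potential attached to x
    g = np.zeros(m)     # dual potential attached to y
    for _ in range(n_iters):
        # Log-domain Sinkhorn updates (soft-min over the other cloud).
        f = -eps * _logsumexp(log_b + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a + (f[:, None] - cost) / eps, axis=0)
    # Dual objective with uniform weights: <a, f> + <b, g>.
    return f.mean() + g.mean()


def sinkhorn_divergence(x, y, eps=0.5, n_iters=200):
    """Debiased Sinkhorn divergence between two sets of samples."""
    return (
        entropic_ot(x, y, eps, n_iters)
        - 0.5 * entropic_ot(x, x, eps, n_iters)
        - 0.5 * entropic_ot(y, y, eps, n_iters)
    )
```

Under these assumptions, `sinkhorn_divergence(z_a, z_b)` could be called on two batches of latent samples of shape `(batch, latent_dim)`; it is zero when both batches are identical and grows as they move apart. A training setup would need a differentiable version in an autodiff framework rather than plain NumPy.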