Data uncertainties, such as sensor noise, occlusions, or limitations of the acquisition method, can introduce irreducible ambiguities in images, which result in varying yet plausible semantic hypotheses. In machine learning, this ambiguity is commonly referred to as aleatoric uncertainty. In image segmentation, latent density models can be used to address this problem. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the Evidence Lower Bound on the conditional data log-likelihood. In this work, we demonstrate that the PU-Net latent space is severely sparse and heavily under-utilized. To address this, we introduce mutual information maximization and entropy-regularized Sinkhorn Divergence in the latent space to promote homogeneity across all latent dimensions, which improves gradient-descent updates and latent space informativeness. Applied to public datasets covering various clinical segmentation problems, our proposed methodology achieves performance gains of up to 11% on the Hungarian-Matched Intersection over Union compared with preceding latent variable models for probabilistic segmentation. The results indicate that encouraging a homogeneous latent space significantly improves latent density modeling for medical image segmentation.
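For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of a Hungarian-matched Intersection over Union between a set of predicted segmentations and a set of ground-truth annotations. It is an illustration only, not the authors' evaluation code. The function names `iou` and `matched_iou` are hypothetical, masks are assumed to be flat sequences of 0/1, both sets are assumed to have the same size, and two empty masks are assumed to score an IoU of 1. For brevity the optimal one-to-one matching is found by exhaustive search over permutations rather than with the Hungarian algorithm; both yield the same optimum, but the exhaustive search is only practical for small sets.

```python
from itertools import permutations


def iou(a, b):
    """IoU of two binary masks given as flat sequences of 0/1."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def matched_iou(preds, truths):
    """Mean IoU under the best one-to-one matching of two equal-sized sets."""
    n = len(truths)
    if n == 0 or len(preds) != n:
        raise ValueError("this sketch assumes two non-empty sets of equal size")
    # Pairwise IoU between every prediction and every ground truth.
    scores = [[iou(p, t) for t in truths] for p in preds]
    # Exhaustive search over assignments (stand-in for the Hungarian algorithm).
    best = max(
        sum(scores[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    return best / n


# Toy example: the predictions match the annotations, but in swapped order.
preds = [[1, 1, 0, 0], [0, 0, 1, 1]]
truths = [[0, 0, 1, 1], [1, 1, 0, 0]]
print(matched_iou(preds, truths))
```

Because the score is computed after matching, a model is not penalized for producing the plausible hypotheses in a different order than the annotators did.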