This work presents a study on label noise in medical image segmentation by considering a noise model based on Gaussian field deformations. Such noise is of interest because it yields realistic-looking segmentations and because it is unbiased in the sense that the expected deformation is the identity mapping. Efficient sampling methods and closed-form solutions for the marginal probabilities are provided. Moreover, the theoretically optimal solutions to the cross-entropy and soft-Dice loss functions are studied, and it is shown how they diverge as the noise level increases. Based on recent work on loss function characterization, it is shown that optimal solutions to soft-Dice can be recovered by thresholding solutions to cross-entropy with a particular threshold that is unknown a priori but can be computed efficiently. This raises the question of whether the decrease in performance seen when using cross-entropy instead of soft-Dice is caused by using the wrong threshold. The hypothesis is validated in 5-fold studies on three organ segmentation problems from the TotalSegmentator data set, using four different noise strengths. The results show that changing the threshold moves cross-entropy from performing systematically worse than soft-Dice to performing similarly to or better than soft-Dice.
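The thresholding idea can be illustrated with a minimal sketch. It is not the procedure from this work: it assumes the cross-entropy solution is available as a list of per-pixel marginal probabilities, and it scores a candidate segmentation with a ratio-of-expectations surrogate for the expected Dice, 2 * sum(p_i over selected pixels) / (number selected + sum of all p_i). The names `dice_surrogate` and `best_threshold` and the toy marginals are hypothetical. Under this surrogate the best segmentation is a super-level set of the marginals, so scanning the sorted marginals once is enough to find the threshold.

```python
def dice_surrogate(marginals, threshold):
    """Surrogate expected Dice of the segmentation {i : p_i >= threshold}.

    Assumption: ratio of expectations, not the exact expected Dice.
    """
    selected = [p for p in marginals if p >= threshold]
    denom = len(selected) + sum(marginals)
    return 2.0 * sum(selected) / denom if denom > 0 else 1.0


def best_threshold(marginals):
    """Scan the super-level sets of the marginals; return (threshold, score)."""
    ordered = sorted(marginals, reverse=True)
    total = sum(ordered)
    best_t, best_score = 1.0, 0.0
    running = 0.0
    for k, p in enumerate(ordered, start=1):
        running += p
        # Only evaluate at the end of a run of equal values, so that the
        # scored set is exactly {i : p_i >= p}.
        if k < len(ordered) and ordered[k] == p:
            continue
        score = 2.0 * running / (k + total)
        if score > best_score:
            best_t, best_score = p, score
    return best_t, best_score


if __name__ == "__main__":
    # Toy marginals along a blurred object boundary (hypothetical values).
    toy = [0.95, 0.9, 0.8, 0.6, 0.45, 0.3, 0.1, 0.05]
    t, score = best_threshold(toy)
    print("selected threshold:", t)
    print("surrogate Dice at selected threshold:", round(score, 4))
    print("surrogate Dice at threshold 0.5:", round(dice_surrogate(toy, 0.5), 4))
```

On this toy input the selected threshold falls below the default 0.5, and the surrogate score at the selected threshold is higher than at 0.5. For this surrogate the selected threshold is also never smaller than half the best score, which is why the scan over sorted marginals suffices.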