Uncertainty in medical image segmentation, especially inter-rater variability arising from differences in how experts interpret and annotate images, presents a significant challenge to achieving consistent and reliable segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) in 2020 and 2021. The challenge focuses on uncertainty quantification in medical image segmentation, taking into account the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations covers various modalities, such as MRI and CT; various organs, such as the brain, prostate, kidney, and pancreas; and both 2D and 3D images. A total of 24 teams submitted solutions, combining various baseline models, Bayesian neural networks, and ensemble techniques. The results indicate the importance of ensemble models, as well as the need for further research into efficient methods for uncertainty quantification in 3D segmentation tasks.