Shortcut learning is a phenomenon in which machine learning models prioritize simple, potentially misleading cues in the data that do not generalize beyond the training set. While existing research primarily investigates this phenomenon in image classification, this study extends the exploration of shortcut learning to medical image segmentation. We demonstrate that clinical annotations such as calipers, as well as the combination of zero-padded convolutions and center-cropped training sets, can inadvertently serve as shortcuts and affect segmentation accuracy. We identify and evaluate shortcut learning on two different but common medical image segmentation tasks. In addition, we suggest strategies to mitigate the influence of shortcut learning and improve the generalizability of segmentation models. By uncovering the presence and implications of shortcuts in medical image segmentation, we provide insights and methodologies for evaluating and overcoming this pervasive challenge, and we call on the community to pay attention to shortcuts in segmentation.
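As a rough illustration of why zero-padded convolutions can expose a positional cue, the sketch below applies a 3x3 averaging filter to a constant image. It is a toy example under stated assumptions, not the setup used in this study: the helper `conv2d_same`, the 6x6 image, and the averaging kernel are all hypothetical, and reflect padding appears only as a point of comparison, not as a recommended mitigation.

```python
import numpy as np

def conv2d_same(image, kernel, pad_mode):
    """Naive 'same'-size 2D cross-correlation with a chosen padding mode (hypothetical helper)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    if pad_mode == "zeros":
        padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant", constant_values=0.0)
    else:
        padded = np.pad(image, ((ph, ph), (pw, pw)), mode=pad_mode)
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# A constant image carries no content cue about where a pixel is located.
image = np.ones((6, 6))
kernel = np.full((3, 3), 1.0 / 9.0)  # simple averaging filter

zero_out = conv2d_same(image, kernel, "zeros")
reflect_out = conv2d_same(image, kernel, "reflect")

# With zero padding, responses near the border differ from the interior,
# so the feature map encodes distance to the image edge.
print(np.round(zero_out, 3))
# With reflect padding, the response to a constant image stays uniform.
print(np.round(reflect_out, 3))
```

With zero padding, the corner and edge responses are lower than the interior response even though the input is uniform, so later layers can in principle infer absolute position. If the training images are center-cropped so that the target always sits at a similar location, such a positional signal could plausibly act as a shortcut.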