The Segment Anything Model (SAM) has recently emerged as a foundation model for image segmentation. Because medical images are intrinsically complex and costly to annotate, the medical image segmentation (MIS) community has been encouraged to investigate SAM's zero-shot capabilities to facilitate automatic annotation. Inspired by the success of the interactive medical image segmentation (IMIS) paradigm, this paper assesses the potential of SAM's zero-shot capabilities within IMIS, with the aim of amplifying its benefits in the MIS domain. We observe, however, that SAM's sensitivity to the form of the prompt (e.g., points, bounding boxes) becomes notably pronounced in IMIS. This motivates a framework that adaptively suggests suitable prompt forms to human experts. We refer to this framework as temporally-extended prompts optimization (TEPO) and model it as a Markov decision process, solvable through reinforcement learning. Numerical experiments on the standardized benchmark BraTS2020 demonstrate that the learned TEPO agent can further enhance SAM's zero-shot capability in the MIS context.