Segmentation in medical imaging plays a crucial role in diagnosing, monitoring, and treating various diseases and conditions. The current landscape of medical image segmentation is dominated by numerous specialized deep learning models, each fine-tuned for a specific segmentation task and image modality. Recently, the Segment Anything Model (SAM), a new segmentation model, was introduced. SAM uses the ViT neural architecture and leverages a vast training dataset to segment almost any object. However, its generalizability to the medical domain remains unexplored. In this study, we assess the zero-shot capabilities of SAM 2D in medical imaging using eight different prompt strategies across six datasets from four imaging modalities: X-ray, ultrasound, dermatoscopy, and colonoscopy. Our results demonstrate that SAM's zero-shot performance is comparable to and, in certain cases, superior to the current state of the art. Based on our findings, we propose a practical guideline that requires minimal interaction and yields robust results in all evaluated contexts.