Segment Anything Model for Medical Images?

Yuhao Huang, Xin Yang, Lian Liu, Han Zhou, Ao Chang, Xinrui Zhou, Rusi Chen, Junxuan Yu, Jiongquan Chen, Chaoyu Chen, Haozhe Chi, Xindi Hu, Deng-Ping Fan, Fajin Dong, Dong Ni

From arXiv, 23 pages, 14 figures, 12 tables

The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It introduces a novel promptable segmentation task that enables zero-shot image segmentation with the pre-trained model through two main modes: automatic everything and manual prompt. SAM has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-ranging object scales. Meanwhile, zero-shot and efficient MIS can substantially reduce annotation time and advance medical image analysis. Hence, SAM appears to be a potential tool, and its performance on large medical datasets should be further validated. We collected and sorted 52 open-source datasets and built a large medical segmentation dataset with 16 modalities, 68 objects, and 553K slices, which we call COSMOS 553K. We conducted a comprehensive analysis of different SAM testing strategies on this dataset. Extensive experiments show that SAM performs better with manual hints such as points and boxes for object perception in medical images, so prompt mode outperforms everything mode. Additionally, SAM shows remarkable performance on some specific objects and modalities, but is imperfect or even fails completely in other situations. Finally, we analyze the influence of different factors (e.g., the Fourier-based boundary complexity and the size of the segmented objects) on SAM's segmentation performance. Overall, the experiments indicate that SAM's zero-shot segmentation capability is not sufficient to ensure its direct application to MIS.
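To make the prompt-mode evaluation more concrete, here is a minimal sketch of how point and box prompts could be derived from a ground-truth mask and how a predicted mask could be scored. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how prompts are generated or which metric is used, so the centroid-nearest point, the tight bounding box, the Dice score, and the helper names `box_prompt`, `point_prompt`, and `dice` are all hypothetical choices. No SAM model is called; the prediction is a toy array.

```python
import numpy as np


def box_prompt(mask):
    """Tight bounding box (x_min, y_min, x_max, y_max) around the foreground.

    Assumption: a box prompt is taken as the exact extent of the ground truth.
    """
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def point_prompt(mask):
    """Foreground pixel closest to the mask centroid, returned as (x, y).

    Assumption: snapping to a foreground pixel keeps the point inside
    non-convex objects, where the raw centroid may fall on background.
    """
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    i = int(np.argmin((ys - cy) ** 2 + (xs - cx) ** 2))
    return int(xs[i]), int(ys[i])


def dice(pred, target):
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


# Toy ground truth: a 3 x 4 rectangle (rows 2..4, columns 3..6).
gt = np.zeros((8, 8), dtype=bool)
gt[2:5, 3:7] = True

# Toy "prediction" covering only the left half of the object.
pred = np.zeros_like(gt)
pred[2:5, 3:5] = True

print("box prompt:", box_prompt(gt))
print("point prompt:", point_prompt(gt))
print("Dice:", round(dice(pred, gt), 4))
```

In a real experiment the prompts would be passed to a SAM predictor and the returned mask would take the place of `pred`; everything mode needs no prompts, but its output masks would first have to be matched to the target object before scoring.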
