Recent advances in segmentation foundation models have enabled accurate and efficient segmentation across a wide range of natural images and videos, but their utility for medical data remains unclear. In this work, we first present a comprehensive benchmark of the Segment Anything Model 2 (SAM2) across 11 medical image modalities and videos, and highlight its strengths and weaknesses by comparing it to SAM1 and MedSAM. Then, we develop a transfer learning pipeline and demonstrate that SAM2 can be quickly adapted to the medical domain via fine-tuning. Furthermore, we implement SAM2 as a 3D Slicer plugin and a Gradio API for efficient 3D image and video segmentation. The code has been made publicly available at \url{https://github.com/bowang-lab/MedSAM}.