The Segment Anything Model (SAM) and similar models form a family of promptable foundation models (FMs) for image and video segmentation. The object of interest is identified using prompts such as bounding boxes or points. As these FMs become part of medical image segmentation, extensive evaluation studies are required to assess their strengths and weaknesses in clinical settings. Since performance depends strongly on the chosen prompting strategy, it is important to investigate different prompting techniques and define guidelines that ensure effective use in medical image segmentation. Currently, no dedicated evaluation studies exist for bone segmentation in CT scans, leaving a gap in the understanding of performance on this task. We therefore use non-iterative, ``optimal'' prompting strategies composed of bounding boxes, points, and their combinations to test the zero-shot capability of SAM-family models for bone CT segmentation on three different skeletal regions. Our results show that the best settings depend on the model type and size, the dataset characteristics, and the objective to optimize. Overall, SAM and SAM2 prompted with a bounding box combined with the center point of every component of an object yield the best results across all tested settings. As the results depend on multiple factors, we provide a guideline for informed decision-making in 2D prompting with non-interactive, ``optimal'' prompts.
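To make the "bounding box plus center point of every component" prompt more concrete, the following is a minimal sketch of how such a prompt could be derived from a 2D ground-truth mask. It is not the authors' implementation. The function names are hypothetical, the use of 4-connectivity is an assumption, and "center point" is interpreted here as the foreground pixel closest to the component centroid, which may differ from the definition used in the study. The (x_min, y_min, x_max, y_max) box and (x, y) point layout are likewise only assumed to resemble typical SAM-style predictor inputs.

```python
from collections import deque


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) over all foreground pixels, or None."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def connected_components(mask):
    """Return one list of (x, y) pixels per 4-connected foreground component."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append(pixels)
    return components


def center_point(pixels):
    """Assumed definition: the foreground pixel nearest to the component centroid."""
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    return min(pixels, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


def box_plus_center_prompts(mask):
    """Build one box and one positive point per component from a binary 2D mask."""
    box = bounding_box(mask)
    if box is None:
        return None  # empty slice: nothing to prompt
    points = [center_point(c) for c in connected_components(mask)]
    return {"box": box, "points": points, "labels": [1] * len(points)}


# Toy slice with two disconnected components of the same object.
example_mask = [
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
]
example_prompts = box_plus_center_prompts(example_mask)
```

Snapping the centroid to the nearest foreground pixel is a deliberate choice in this sketch: for concave or ring-shaped bone cross-sections, the raw centroid can fall on background, which would turn a positive point prompt into a misleading one.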