Segmentation models such as the Segment Anything Model (SAM) and SAM2 achieve strong prompt-driven zero-shot performance. However, because they are trained on natural images, their transfer to medical data is limited, and accurate segmentation in that domain typically requires extensive fine-tuning and expert-designed prompts. We propose DiffuSAM, a diffusion-based adaptation of SAM2 for prompt-free medical image segmentation. Our framework uses a lightweight diffusion prior, conditioned on off-the-shelf frozen SAM2 image features, to synthesize SAM2-compatible, mask-like segmentation embeddings. The generated embeddings are passed to SAM2's mask decoder to produce accurate segmentations, thereby eliminating the need for user prompts. The diffusion prior is further conditioned on previously segmented slices, enforcing spatial consistency across volumes. Evaluated on the BTCV and CHAOS datasets for CT and MRI under Source-Free Unsupervised Domain Adaptation (SF-UDA) and few-shot settings, DiffuSAM achieves competitive performance with efficient training and inference. Code is available upon request from the corresponding author.
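The slice-by-slice inference described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sampler, so standard DDPM ancestral sampling is assumed here, and every name (`linear_betas`, `toy_noise_predictor`, `sample_embedding`, `segment_volume`) is hypothetical. In particular, `toy_noise_predictor` is a hand-written placeholder for the learned diffusion prior, the per-slice "features" are short lists of floats rather than real SAM2 encoder outputs, and the SAM2 mask decoder step is omitted.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule, a common default for DDPM-style models."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def toy_condition_target(image_feat, prev_embed):
    """Hypothetical stand-in for what a trained prior infers from its conditions."""
    if prev_embed is None:  # first slice: no previously segmented slice exists
        return list(image_feat)
    return [0.5 * (f + p) for f, p in zip(image_feat, prev_embed)]


def toy_noise_predictor(x_t, alpha_bar_t, image_feat, prev_embed):
    """Placeholder for the learned denoiser: the noise implied by the toy target."""
    x0_hat = toy_condition_target(image_feat, prev_embed)
    scale = math.sqrt(1.0 - alpha_bar_t)
    root = math.sqrt(alpha_bar_t)
    return [(x - root * x0) / scale for x, x0 in zip(x_t, x0_hat)]


def sample_embedding(image_feat, prev_embed, betas, rng):
    """DDPM ancestral sampling of one embedding, given the two conditions."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)

    x = [rng.gauss(0.0, 1.0) for _ in image_feat]  # start from pure noise
    for t in reversed(range(len(betas))):
        beta, alpha_bar = betas[t], alpha_bars[t]
        eps = toy_noise_predictor(x, alpha_bar, image_feat, prev_embed)
        coef = beta / math.sqrt(1.0 - alpha_bar)
        mean = [(xi - coef * ei) / math.sqrt(1.0 - beta) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(beta)  # no noise is added at the final step
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


def segment_volume(slice_features, num_steps=50, seed=0):
    """Generate one embedding per slice, each conditioned on the previous one."""
    rng = random.Random(seed)
    betas = linear_betas(num_steps)
    embeddings, prev = [], None
    for feat in slice_features:
        emb = sample_embedding(feat, prev, betas, rng)
        embeddings.append(emb)
        prev = emb  # carry forward to condition the next slice
    return embeddings


if __name__ == "__main__":
    volume = [[1.0, -2.0], [3.0, 0.0], [2.5, 0.5]]  # toy per-slice features
    for i, emb in enumerate(segment_volume(volume)):
        print(i, [round(v, 4) for v in emb])
```

In a real pipeline, the placeholder predictor would be replaced by a trained network, and each sampled embedding would be handed to the frozen SAM2 mask decoder in place of a user prompt. The point of the sketch is only the control flow: one reverse-diffusion run per slice, with the previous slice's result fed back in as a condition.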