Leveraging the Segment Anything Model (SAM) for medical image segmentation remains challenging due to its limited adaptability across diverse medical domains. Although fine-tuned variants such as MedSAM improve performance on modalities and organs similar to their training data, they may generalize poorly to unseen data. To overcome this limitation, we propose SAM-aware Test-time Adaptation (SAM-TTA), a lightweight and flexible framework that preserves SAM's inherent generalization ability while improving segmentation accuracy on medical images. SAM-TTA tackles two major challenges: (1) input-level discrepancy caused by the channel mismatch between natural and medical images, and (2) semantic-level discrepancy caused by differing object characteristics (e.g., the clear boundaries typical of natural images vs. the ambiguous structures common in medical images). To this end, we introduce two complementary components: a Self-adaptive Bezier Curve-based Transformation (SBCT), which maps single-channel medical images into SAM-compatible three-channel images via a small set of learnable parameters optimized at test time, and IoU-guided Multi-scale Adaptation (IMA), which leverages SAM's intrinsic IoU scores to enforce high output confidence, dual-scale prediction consistency, and intermediate feature consistency, thereby improving semantic-level alignment. Extensive experiments on eight public medical image segmentation tasks, covering six grayscale and two color (endoscopic) tasks, demonstrate that SAM-TTA consistently outperforms state-of-the-art test-time adaptation methods. Notably, on the six grayscale datasets, SAM-TTA even surpasses fully fine-tuned models, achieving significant Dice improvements (average gains of 4.8% and 7.4% over MedSAM and SAM-Med2D, respectively) and establishing a new paradigm for universal medical image segmentation. Code is available at https://github.com/JianghaoWu/SAM-TTA.
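For intuition, below is a minimal PyTorch sketch of the SBCT idea: a grayscale image is mapped to three channels, each through its own learnable intensity curve. The specific parameterization here (a cubic Bernstein polynomial per channel, with four learnable control values initialized to the identity curve, and the class name `SBCTSketch`) is an illustrative assumption, not the released implementation.

```python
import torch
import torch.nn as nn

class SBCTSketch(nn.Module):
    """Sketch of a self-adaptive Bezier-curve channel transformation.

    Assumption: each output channel applies y_c(t) = sum_k ctrl[c, k] * B_k(t),
    where B_k are the cubic Bernstein basis polynomials and t is the normalized
    pixel intensity. Only `ctrl` (12 scalars) is optimized at test time.
    """

    def __init__(self):
        super().__init__()
        # Control values 0, 1/3, 2/3, 1 reproduce the identity curve y(t) = t,
        # so adaptation starts from the unmodified image replicated to 3 channels.
        init = torch.tensor([0.0, 1 / 3, 2 / 3, 1.0]).repeat(3, 1)  # (3, 4)
        self.ctrl = nn.Parameter(init)

    def forward(self, x):
        # x: (B, 1, H, W) grayscale image with intensities normalized to [0, 1].
        t = x.squeeze(1).clamp(0.0, 1.0)                 # (B, H, W)
        b = torch.stack([(1 - t) ** 3,
                         3 * (1 - t) ** 2 * t,
                         3 * (1 - t) * t ** 2,
                         t ** 3], dim=-1)                # (B, H, W, 4)
        # One curve per output channel -> SAM-compatible 3-channel image.
        return torch.einsum('bhwk,ck->bchw', b, self.ctrl)  # (B, 3, H, W)

# Usage sketch: only the curve parameters are updated at test time;
# the resulting 3-channel image is fed to SAM's frozen image encoder.
sbct = SBCTSketch()
optimizer = torch.optim.Adam(sbct.parameters(), lr=1e-2)
gray = torch.rand(1, 1, 256, 256)
rgb = sbct(gray)  # (1, 3, 256, 256)
```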
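The IMA objective can likewise be sketched as a single test-time loss. Everything below is a hedged illustration of the three stated terms: the helper `sam_forward` (returning mask logits, SAM's predicted IoU score, and an intermediate feature map), the loss weights `w_conf`/`w_pred`/`w_feat`, and the use of MSE for the consistency terms are all assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def ima_loss(sam_forward, image, scale=0.5, w_conf=1.0, w_pred=1.0, w_feat=0.1):
    """Sketch of IoU-guided Multi-scale Adaptation (hypothetical helpers).

    `sam_forward(image)` is assumed to return (mask_logits, iou_score, features)
    from a SAM-style model; `scale` defines the second input scale.
    """
    # Dual-scale forward passes: full resolution and a down-scaled copy.
    logits_a, iou_a, feat_a = sam_forward(image)
    small = F.interpolate(image, scale_factor=scale, mode='bilinear',
                          align_corners=False)
    logits_b, iou_b, feat_b = sam_forward(small)

    # (1) Output confidence: push SAM's own predicted IoU scores toward 1.
    loss_conf = (1 - iou_a).mean() + (1 - iou_b).mean()

    # (2) Dual-scale prediction consistency: masks should agree after resizing.
    logits_b_up = F.interpolate(logits_b, size=logits_a.shape[-2:],
                                mode='bilinear', align_corners=False)
    loss_pred = F.mse_loss(torch.sigmoid(logits_a), torch.sigmoid(logits_b_up))

    # (3) Intermediate feature consistency between the two scales.
    feat_b_up = F.interpolate(feat_b, size=feat_a.shape[-2:],
                              mode='bilinear', align_corners=False)
    loss_feat = F.mse_loss(feat_a, feat_b_up)

    return w_conf * loss_conf + w_pred * loss_pred + w_feat * loss_feat
```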