The recently introduced Segment Anything Model (SAM), a Visual Foundation Model (VFM), has demonstrated impressive zero-shot segmentation capabilities across diverse natural image datasets. Despite this success, SAM suffers noticeable performance degradation when applied to specific domains, such as medical images. Current efforts to address this issue rely on fine-tuning strategies intended to bolster the generalizability of the vanilla SAM. However, these approaches still largely require domain-specific, expert-level prompts during the evaluation phase, which severely constrains the model's practicality. To overcome this limitation, we introduce a novel self-prompting fine-tuning approach, called SAM-SP, tailored to extend the vanilla SAM model. Specifically, SAM-SP leverages the output of the model's previous iteration as the prompt that guides its subsequent iteration. This self-prompting module learns to generate useful prompts autonomously, alleviating the dependence on expert prompts during evaluation and significantly broadening SAM's applicability. Additionally, we integrate a self-distillation module to further enhance the self-prompting process. Extensive experiments across various domain-specific datasets validate the effectiveness of the proposed SAM-SP. SAM-SP not only alleviates the reliance on expert prompts but also achieves superior segmentation performance compared to state-of-the-art task-specific segmentation approaches, the vanilla SAM, and other SAM-based approaches.
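The core self-prompting idea described above can be sketched as a two-pass inference loop: a first forward pass produces a coarse mask, which is then converted into a prompt (here, a bounding box, one common SAM prompt type) for a second pass. This is a minimal illustrative sketch, not the paper's actual implementation; `segment(image, prompt)` is an assumed stand-in for a SAM-style model call.

```python
import numpy as np

def mask_to_box_prompt(mask: np.ndarray):
    """Derive a bounding-box prompt (x0, y0, x1, y1) from a binary mask."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None  # empty mask: no box can be derived, fall back to pass 1
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

def self_prompting_pass(segment, image, init_prompt=None):
    """Two-stage inference: the first pass's output prompts the second.

    `segment` is a hypothetical callable standing in for a SAM-style
    model; the real SAM-SP learns this behavior during fine-tuning
    rather than applying it purely at inference time.
    """
    coarse = segment(image, init_prompt)   # pass 1: no expert prompt needed
    box = mask_to_box_prompt(coarse)       # turn the model's own output into a prompt
    if box is None:
        return coarse
    return segment(image, box)             # pass 2: self-prompted refinement
```

The key property is that no human-provided, expert-level prompt enters the loop: the prompt for the refinement pass is generated from the model's own prior output.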