Foundation models such as OpenAI's GPT-3 and GPT-4, Meta's LLaMA, and Google's PaLM 2 have revolutionized the field of artificial intelligence. In computer vision, a notable example is the Segment Anything Model (SAM), which was trained on 1 billion masks and 11 million images and shows a remarkable ability to segment real-world objects. Although SAM excels at general object segmentation, it has no built-in notion of saliency, so it performs suboptimally on salient object detection. To address this limitation, we present the Segment Salient Object Model (SSOM), which adaptively fine-tunes SAM for salient object detection by exploiting the low-rank structure inherent in deep learning models. Comprehensive qualitative and quantitative evaluations on five challenging RGB benchmark datasets show that our approach outperforms state-of-the-art methods.
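The abstract does not spell out how the low-rank fine-tuning is carried out, so the following is only a minimal sketch of the general idea, assuming a LoRA-style update: a pretrained weight matrix W is kept frozen and a trainable rank-r product B·A is added to it. The class name `LowRankAdapter`, the helper functions, the 64×64 layer size, and the rank of 4 are all hypothetical choices for illustration and are not taken from SSOM or from SAM's code.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


class LowRankAdapter:
    """Frozen weight W (d_out x d_in) plus a trainable rank-r update B @ A.

    Hypothetical illustration of low-rank fine-tuning, not the SSOM code.
    """

    def __init__(self, weight, rank, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, never updated
        # A gets a small random init; B starts at zero, so B @ A is zero
        # and the adapted layer initially matches the pretrained layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def effective_weight(self):
        """Return W + B @ A, the weight actually applied to the input."""
        return add(self.weight, matmul(self.B, self.A))

    def forward(self, x):
        """Apply the adapted layer to an input vector x (list of floats)."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_params(self):
        """Count only the entries of A and B; W is frozen."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Example: a 64x64 layer adapted with rank 4.
rng = random.Random(1)
W = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
layer = LowRankAdapter(W, rank=4)
full_params = len(W) * len(W[0])      # 4096 entries in the frozen weight
lora_params = layer.trainable_params()  # 512 entries in A and B combined
```

In this toy setting the adapter trains 512 values instead of 4096, which is the practical appeal of low-rank fine-tuning for a large model such as SAM. An adaptive variant would additionally choose the rank per layer, which this sketch does not attempt.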