Trained on the large-scale SA-1B dataset, the Segment Anything Model (SAM), a foundation model proposed by Meta AI Research, exhibits remarkable generalization and zero-shot capabilities. Nonetheless, as a category-agnostic instance segmentation method, SAM depends heavily on manual prompts in the form of points, boxes, and coarse-grained masks. Additionally, its performance on remote sensing image segmentation tasks has yet to be fully explored and demonstrated. In this paper, we design an automated instance segmentation approach for remote sensing images that builds on the SAM foundation model and incorporates semantic category information. Inspired by prompt learning, we propose a method, which we refer to as RSPrompter, that learns to generate appropriate prompts as input to SAM. This enables SAM to produce semantically discernible segmentation results for remote sensing images. We also propose several derivative methods for instance segmentation, based on recent developments in the SAM community, and compare their performance with RSPrompter. Extensive experimental results on the WHU building, NWPU VHR-10, and SSDD datasets validate the efficacy of our proposed method. Our code is accessible at \url{https://kyanchen.github.io/RSPrompter}.