Leveraging the extensive training data from SA-1B, the Segment Anything Model (SAM) demonstrates remarkable generalization and zero-shot capabilities. However, as a category-agnostic instance segmentation method, SAM relies heavily on prior manual guidance in the form of points, boxes, and coarse-grained masks. Furthermore, its performance on remote sensing image segmentation tasks remains largely unexplored and unproven. In this paper, we aim to develop an automated instance segmentation approach for remote sensing images that builds on the SAM foundation model and incorporates semantic category information. Drawing inspiration from prompt learning, we propose a method, termed RSPrompter, that learns to generate appropriate prompts for SAM, enabling SAM to produce semantically discernible segmentation results for remote sensing images. We also propose several derivative methods for instance segmentation, drawing on recent developments within the SAM community, and compare their performance with RSPrompter. Extensive experimental results on the WHU building, NWPU VHR-10, and SSDD datasets validate the effectiveness of our proposed method. The code for our method is publicly available at kychen.me/RSPrompter.