CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation

Ziyang Gong, Zhixiang Wei, Di Wang, Xianzheng Ma, Hongruixuan Chen, Yuru Jia, Yupeng Deng, Zhenming Ji, Xiangwei Zhu, Naoto Yokoya, Jing Zhang, Bo Du, Liangpei Zhang

From arXiv. The code and models will be available at https://github.com/Cuzyoung/CrossEarth

Remote Sensing Domain Generalization (RSDG) has emerged as a critical and valuable research frontier, focusing on developing models that generalize effectively across diverse scenarios. Although RS images exhibit substantial domain gaps arising from variations in location, wavelength, and sensor type, research in this area remains underexplored: (1) current cross-domain methods primarily focus on Domain Adaptation (DA), which adapts models to predefined domains rather than to unseen ones; (2) few studies target the RSDG issue, especially for semantic segmentation tasks, and existing models are developed for specific unknown domains, so they struggle with underfitting on other unknown scenarios; (3) existing RS foundation models tend to prioritize in-domain performance over cross-domain generalization. To this end, we introduce CrossEarth, the first vision foundation model for RSDG semantic segmentation. CrossEarth demonstrates strong cross-domain generalization through a specially designed data-level Earth-Style Injection pipeline and a model-level Multi-Task Training pipeline. In addition, for the semantic segmentation task, we have curated an RSDG benchmark comprising 28 cross-domain settings across various regions, spectral bands, platforms, and climates, providing a comprehensive framework for testing the generalizability of future RSDG models. Extensive experiments on this benchmark demonstrate the superiority of CrossEarth over existing state-of-the-art methods.

