Remote Sensing Domain Generalization (RSDG) has emerged as a critical and valuable research frontier, focusing on developing models that generalize effectively across diverse scenarios. Although remote sensing (RS) images exhibit substantial domain gaps arising from variations in location, wavelength, and sensor type, research in this area remains underexplored: (1) current cross-domain methods primarily focus on Domain Adaptation (DA), which adapts models to predefined domains rather than to unseen ones; (2) few studies target the RSDG problem, especially for semantic segmentation, where existing models are developed for specific unknown domains and underfit on other unknown scenarios; (3) existing RS foundation models tend to prioritize in-domain performance over cross-domain generalization. To this end, we introduce CrossEarth, the first vision foundation model for RSDG semantic segmentation. CrossEarth demonstrates strong cross-domain generalization through a specially designed data-level Earth-Style Injection pipeline and a model-level Multi-Task Training pipeline. In addition, for the semantic segmentation task, we have curated an RSDG benchmark comprising 28 cross-domain settings across various regions, spectral bands, platforms, and climates, providing a comprehensive framework for testing the generalizability of future RSDG models. Extensive experiments on this benchmark demonstrate the superiority of CrossEarth over existing state-of-the-art methods.