FlexiMo: A Flexible Remote Sensing Foundation Model

The rapid expansion of multi-source satellite imagery drives innovation in Earth observation, opening unprecedented opportunities for Remote Sensing Foundation Models to harness diverse data. However, many existing models remain constrained by fixed spatial resolutions and patch sizes, limiting their ability to fully exploit the heterogeneous spatial characteristics inherent in satellite imagery. To address these challenges, we propose FlexiMo, a flexible remote sensing foundation model that endows the pre-trained model with the flexibility to adapt to arbitrary spatial resolutions. Central to FlexiMo is a spatial resolution-aware module that employs a parameter-free alignment embedding mechanism to dynamically recalibrate patch embeddings based on the input image's resolution and dimensions. This design not only preserves critical token characteristics and ensures multi-scale feature fidelity but also enables efficient feature extraction without requiring modifications to the underlying network architecture. In addition, FlexiMo incorporates a lightweight channel adaptation module that leverages prior spectral information from sensors. This mechanism allows the model to process images with varying numbers of channels while maintaining the data's intrinsic physical properties. Extensive experiments on diverse multimodal, multi-resolution, and multi-scale datasets demonstrate that FlexiMo significantly enhances model generalization and robustness. In particular, our method achieves outstanding performance across a range of downstream tasks, including scene classification, land cover classification, urban building segmentation, and cloud detection. By enabling parameter-efficient and physically consistent adaptation, FlexiMo paves the way for more adaptable and effective foundation models in real-world remote sensing applications.
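The abstract does not spell out how the parameter-free alignment embedding works. As a rough illustration only, the sketch below assumes a ViT-style linear patch embedding and swaps in a FlexiViT-style pseudo-inverse kernel resize: the pre-trained kernel is resampled to a new patch size, with no new learnable parameters, so that a bilinearly upsampled patch yields the same token as the original patch. The function names (`bilinear_matrix_1d`, `resize_matrix_2d`, `resize_patch_kernel`) and the shapes are hypothetical and are not taken from FlexiMo.

```python
import numpy as np


def bilinear_matrix_1d(src: int, dst: int) -> np.ndarray:
    """Return a (dst, src) matrix that linearly resamples a length-src signal."""
    mat = np.zeros((dst, src))
    for i in range(dst):
        # Map the centre of output sample i into input coordinates
        # (half-pixel convention), clamped to the valid range.
        x = (i + 0.5) * src / dst - 0.5
        x = min(max(x, 0.0), src - 1.0)
        lo = int(np.floor(x))
        hi = min(lo + 1, src - 1)
        frac = x - lo
        mat[i, lo] += 1.0 - frac
        mat[i, hi] += frac
    return mat


def resize_matrix_2d(src: int, dst: int) -> np.ndarray:
    """Return the (dst*dst, src*src) matrix acting on row-major flattened patches."""
    r = bilinear_matrix_1d(src, dst)
    return np.kron(r, r)


def resize_patch_kernel(weight: np.ndarray, new_patch: int) -> np.ndarray:
    """Resize a patch-embedding kernel without adding parameters.

    weight: (embed_dim, channels, p, p) -> (embed_dim, channels, new_patch, new_patch)
    Solves B^T w' = w in the least-squares sense, where B is the bilinear
    resize matrix, so that <B x, w'> matches <x, w> as closely as possible.
    """
    d, c, p, _ = weight.shape
    B = resize_matrix_2d(p, new_patch)      # (q*q, p*p)
    pinv = np.linalg.pinv(B.T)              # (q*q, p*p)
    flat = weight.reshape(d * c, p * p)
    out = flat @ pinv.T                     # (d*c, q*q)
    return out.reshape(d, c, new_patch, new_patch)


# Usage with made-up shapes: a 4x4 kernel adapted to 8x8 patches.
rng = np.random.default_rng(0)
kernel_4 = rng.normal(size=(6, 3, 4, 4))    # hypothetical pre-trained kernel
patch_4 = rng.normal(size=(3, 4, 4))        # one 3-channel patch

kernel_8 = resize_patch_kernel(kernel_4, 8)
patch_8 = (patch_4.reshape(3, 16) @ resize_matrix_2d(4, 8).T).reshape(3, 8, 8)

token_4 = np.einsum("dchw,chw->d", kernel_4, patch_4)
token_8 = np.einsum("dchw,chw->d", kernel_8, patch_8)
assert np.allclose(token_4, token_8)        # tokens agree after upsampling
```

When the patch size grows, the resize matrix has full column rank, so the token is preserved exactly for bilinearly upsampled inputs. When the patch size shrinks, the match is only a least-squares approximation.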

