Developing a foundation model for high-resolution remote sensing data of the Netherlands

We develop a foundation model using 1.2 m high-resolution satellite images of the Netherlands. By combining a Convolutional Neural Network and a Vision Transformer, the model captures both high- and low-frequency landscape features: fine textures, edges, and small objects on the one hand, and large terrain structures, elevation patterns, and land-cover distributions on the other. Taking temporal data as input, the model learns from broader contextual information across time, which lets it exploit temporal dependencies such as topographic features, land-cover changes, and seasonal dynamics. These additional constraints reduce feature ambiguity, improve representation learning, and enable better generalization with fewer labeled samples. The foundation model is evaluated on multiple downstream tasks, ranging from use cases within the Netherlands to global benchmark datasets. On the vegetation monitoring dataset of the Netherlands, incorporating temporal information yields clear performance improvements over relying on a single time point. Despite being smaller and pretrained on less data, limited to the Netherlands, the model achieves competitive results on global benchmarks when compared to state-of-the-art models. These results demonstrate that the model can learn rich, generalizable representations from limited data while using a fraction of the parameters of larger state-of-the-art remote sensing models. To maximize reproducibility and reuse, we have made the scripts and the model accessible on GitHub.

