Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unlabeled data can substantially improve generalization to downstream tasks. Such models, recently coined foundation models, have transformed the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprising six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to characterize the performance of existing models. We believe that this benchmark will drive progress across a variety of Earth monitoring tasks.