Remote Sensing Vision-Language Models (RSVLMs) have shown remarkable potential thanks to large-scale pretraining, achieving strong zero-shot performance on various tasks. However, their ability to generalize in low-data regimes, such as few-shot learning, remains insufficiently explored. In this work, we present the first structured benchmark for evaluating few-shot adaptation methods on RSVLMs. We conduct comprehensive experiments across ten remote sensing scene classification datasets, applying five widely used few-shot adaptation strategies to three state-of-the-art RSVLMs with varying backbones. Our findings reveal that models with similar zero-shot performance can exhibit markedly different behavior under few-shot adaptation, with some RSVLMs being inherently more amenable to such adaptation than others. The variability in performance, together with the absence of a clear winner among existing methods, highlights the need for more robust few-shot adaptation methods tailored to remote sensing. To facilitate future research, we provide a reproducible benchmarking framework and open-source code to systematically evaluate RSVLMs under few-shot conditions. The source code is publicly available on GitHub: https://github.com/elkhouryk/fewshot_RSVLMs