Data-driven deep learning methods have shown great potential in cropland mapping. However, because cropland attributes (topography, climate, crop type) and imaging conditions (viewing angle, illumination, scale) vary, croplands in different scenes exhibit a large domain gap. This makes it difficult for models trained on specific scenes to generalize directly to other scenes. A common way to handle this problem is the "Pretrain+Fine-tuning" paradigm. Unfortunately, given the variety of cropland features shaped by these factors, it is hard to bridge the complex domain gap between pre-training data and target data using only sparse fine-tuning samples as general constraints. Moreover, as the number of model parameters grows, fine-tuning is no longer an easy, low-cost task. With the emergence of prompt learning via visual foundation models, the "Pretrain+Prompting" paradigm redesigns the optimization target by introducing an individual prompt for each sample. This simplifies domain adaptation from generic to specific scenes during model inference. Therefore, we introduce the "Pretrain+Prompting" paradigm to cropland scene interpretation and design an auto-prompting (APT) method based on a freely available global land cover product. It achieves fine-grained adaptation from generic scenes to specialized cropland scenes without introducing additional labeling costs. To the best of our knowledge, this work is the first to explore domain adaptation for cropland mapping from a prompt learning perspective. Experiments on two sub-meter cropland datasets from southern and northern China demonstrate that the proposed method, built on visual foundation models, outperforms traditional supervised learning and fine-tuning approaches in the field of remote sensing.