In this work, we pre-train DINO-ViT based models on one of two Synthetic Aperture Radar datasets (S1GRD or GSSIC) across three regions (China, CONUS, Europe). We fine-tune the models on smaller labeled datasets to predict vegetation percentage, and empirically study the connection between the models' embedding spaces and their ability to generalize across diverse geographic regions and to unseen data. For S1GRD, the embedding spaces of different regions are clearly separated, while those of GSSIC overlap. Positional patterns persist through fine-tuning, and greater distances between embeddings often result in higher errors for unfamiliar regions. With this, our work improves the understanding of generalizability for self-supervised models applied to remote sensing.