Self-supervised learning (SSL) has become a powerful paradigm for learning from large, unlabeled datasets, particularly in computer vision (CV). However, applying SSL to multispectral remote sensing (RS) images presents unique challenges and opportunities due to the geographical and temporal variability of the data. In this paper, we introduce GeoRank, a novel regularization method for contrastive SSL that improves upon prior techniques by directly optimizing spherical distances to embed geographical relationships into the learned feature space. GeoRank outperforms or matches prior methods that integrate geographical metadata and consistently improves diverse contrastive SSL algorithms (e.g., BYOL, DINO). Beyond this, we present a systematic investigation of key adaptations of contrastive SSL for multispectral RS images, including the effectiveness of data augmentations, the impact of dataset cardinality and image size on performance, and the task dependency of temporal views. Code is available at https://github.com/tomburgert/georank.
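To make "spherical distances" concrete, here is a minimal sketch of the great-circle (haversine) distance between two image locations given as latitude and longitude in degrees. It is not the GeoRank objective, which the abstract does not spell out. The names `haversine_km` and `rank_by_distance`, the Earth radius constant, and the idea of ordering samples by geographic proximity are assumptions made for illustration only.

```python
import math

# Mean Earth radius in kilometres (assumed value for illustration).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def rank_by_distance(anchor, others):
    """Hypothetical helper: indices of `others`, nearest to farthest from `anchor`."""
    dists = [haversine_km(anchor[0], anchor[1], p[0], p[1]) for p in others]
    return sorted(range(len(others)), key=dists.__getitem__)


# Example: order three locations by their distance from Berlin.
berlin = (52.52, 13.405)
candidates = [(35.6762, 139.6503),  # Tokyo
              (48.8566, 2.3522),    # Paris
              (52.39, 13.06)]       # Potsdam
order = rank_by_distance(berlin, candidates)
```

A regularizer that embeds geographical relationships could, under these assumptions, compare such a geographic ordering with the ordering of distances in feature space. How GeoRank actually does this is specified in the paper and the linked repository, not here.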