Traditional supervised drone-view geo-localization (DVGL) methods depend heavily on paired training data and struggle to learn cross-view correlations from unpaired data. Moreover, when deployed in a new domain, these methods require collecting new paired data and retraining the model to adapt it, which significantly increases computational overhead. Existing unsupervised methods generate pseudo-labels from cross-view similarity to infer the pairing relationships. However, geographical similarity and spatial continuity often produce visually similar features at different geographical locations. This feature confusion compromises the reliability of pseudo-label generation, and incorrect pseudo-labels drive negative optimization. Given these challenges inherent in both supervised and unsupervised DVGL methods, we propose a novel cross-domain invariant knowledge transfer network (CDIKTNet) with limited supervision, whose architecture consists of a cross-domain invariance sub-network (CDIS) and a cross-domain transfer sub-network (CDTS). This architecture forms a closed-loop framework for invariant feature learning and knowledge transfer. The CDIS is designed to learn cross-view structural and spatial invariance from a small amount of paired data that serves as prior knowledge. It endows the shared feature space of unpaired data with similar implicit cross-view correlations at initialization, which alleviates feature confusion. Building on this, the CDTS employs dual-path contrastive learning to further optimize each subspace while preserving consistency in the shared feature space. Extensive experiments demonstrate that CDIKTNet achieves state-of-the-art performance under full supervision compared with supervised methods, and further surpasses existing unsupervised methods under both few-shot and cross-domain initialization.
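To make the feature-confusion problem concrete, the following is a minimal sketch of similarity-based pseudo-label generation. It is an illustration under stated assumptions, not the procedure used by CDIKTNet or by any specific unsupervised method: the mutual-nearest-neighbour rule, the function names `cosine` and `mutual_nn_pseudo_labels`, and the toy 2-D features are all hypothetical. True pairs share the same index, and locations 0 and 2 are deliberately made to look alike.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mutual_nn_pseudo_labels(drone_feats, sat_feats):
    """Return (drone_idx, sat_idx) pairs that are each other's nearest neighbour.

    Assumption: a mutual-nearest-neighbour rule stands in for the
    pseudo-labelling step; real methods may use a different rule.
    """
    sim = [[cosine(d, s) for s in sat_feats] for d in drone_feats]
    # Best satellite match for every drone feature.
    d2s = [max(range(len(sat_feats)), key=lambda j: row[j]) for row in sim]
    # Best drone match for every satellite feature.
    s2d = [max(range(len(drone_feats)), key=lambda i: sim[i][j])
           for j in range(len(sat_feats))]
    return [(i, j) for i, j in enumerate(d2s) if s2d[j] == i]


# Hypothetical toy features: index i in both lists is the same location.
drone_feats = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
sat_feats = [[0.8, 0.2], [0.0, 1.0], [1.0, 0.0]]

pairs = mutual_nn_pseudo_labels(drone_feats, sat_feats)
print(pairs)  # [(0, 2), (1, 1)]
```

In this toy case the drone view of location 0 is paired with the satellite view of location 2, and location 2 receives no label at all. A wrong pair of this kind is the sort of error that the paired prior knowledge learned by the CDIS is meant to reduce.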