Aerial-view geo-localization aims to determine an unknown position by matching a drone-view image against geo-tagged satellite-view images. The task is usually treated as an image retrieval problem, and the key to it is designing deep neural networks that learn discriminative image descriptors. However, existing methods suffer large performance drops under realistic weather conditions, such as rain and fog, because they do not take into account the domain shift between the training data and the multiple test environments. To narrow this domain gap, we propose a Multiple-environment Self-adaptive Network (MuSe-Net) that dynamically adjusts to the domain shift caused by environmental changes. In particular, MuSe-Net employs a two-branch neural network consisting of one multiple-environment style extraction network and one self-adaptive feature extraction network. As the names imply, the multiple-environment style extraction network extracts the environment-related style information, while the self-adaptive feature extraction network uses an adaptive modulation module to dynamically minimize the environment-related style gap. Extensive experiments on two widely used benchmarks, i.e., University-1652 and CVUSA, demonstrate that the proposed MuSe-Net achieves competitive results for geo-localization in multiple environments. Furthermore, we observe that the proposed method also shows great potential under unseen extreme weather, such as mixtures of fog, rain, and snow.
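To make the two ideas in the abstract concrete, the following is a minimal sketch rather than the MuSe-Net implementation. It assumes an AdaIN-style modulation (normalize a feature channel, then apply a scale and shift that a style branch would predict) and cosine-similarity ranking for the retrieval step. The function names `modulate`, `cosine`, and `rank_gallery`, the toy "fog" transform, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def modulate(feature, scale, shift, eps=1e-5):
    """Normalize one feature channel, then re-scale and re-shift it.

    `scale` and `shift` stand in for values a style branch would predict.
    This AdaIN-style form is an assumption, not the paper's exact module.
    """
    mean = sum(feature) / len(feature)
    var = sum((x - mean) ** 2 for x in feature) / len(feature)
    std = math.sqrt(var + eps)
    return [scale * (x - mean) / std + shift for x in feature]


def cosine(a, b):
    """Cosine similarity between two descriptors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    return sorted(gallery, key=lambda k: cosine(query, gallery[k]), reverse=True)


if __name__ == "__main__":
    clear = [0.2, 0.9, 0.4, 0.7]
    # Toy "fog": lower contrast and higher brightness (illustrative only).
    foggy = [0.5 * x + 0.4 for x in clear]
    # After normalization the two channels nearly coincide.
    print(modulate(clear, 1.0, 0.0))
    print(modulate(foggy, 1.0, 0.0))

    # Retrieval: rank hypothetical satellite descriptors for a drone query.
    satellites = {"sat_a": [0.9, 0.1], "sat_b": [0.0, 1.0]}
    print(rank_gallery([1.0, 0.0], satellites))
```

In this toy setting, the normalization step removes the global contrast and brightness change, which is one simple way to picture how an environment-related style gap could be reduced before descriptors are compared. A learned module would predict the scale and shift per channel from the style branch instead of using fixed values.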