Visual Place Recognition is the task of predicting the coordinates of an image (called the query) based solely on visual cues. Most commonly, a retrieval approach is adopted, where the query is matched to the most similar images from a large database of geotagged photos, using learned global descriptors. Despite recent advances, recognizing the same place when the query comes from a significantly different distribution is still a major hurdle for state-of-the-art retrieval methods. Examples include heavy illumination changes (e.g. night-time images) and substantial occlusions (e.g. transient objects). In this work, we explore whether re-ranking methods based on spatial verification can tackle these challenges, following the intuition that local descriptors are inherently more robust to domain shifts than global features. To this end, we provide a new, comprehensive benchmark of current state-of-the-art models. We also introduce two new demanding datasets with night and occluded queries, to be matched against a city-wide database. Code and datasets are available at https://github.com/gbarbarani/re-ranking-for-VPR.
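The retrieve-then-re-rank pipeline described above can be illustrated with a minimal sketch. It is not the code from the linked repository: all function names are hypothetical, descriptors and keypoints are assumed to be precomputed NumPy arrays, and a 2D affine model fitted with a basic RANSAC loop stands in for the homography or fundamental-matrix estimation that spatial verification typically uses.

```python
import numpy as np


def retrieve_top_k(query_desc, db_descs, k):
    """Rank database images by cosine similarity of global descriptors."""
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q
    return np.argsort(-sims)[:k]


def mutual_nn_matches(desc_a, desc_b):
    """Return index pairs (i, j) that are mutual nearest neighbours."""
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = dists.argmin(axis=1)  # best match in b for each descriptor in a
    nn_ba = dists.argmin(axis=0)  # best match in a for each descriptor in b
    idx_a = np.arange(len(desc_a))
    keep = nn_ba[nn_ab] == idx_a
    return idx_a[keep], nn_ab[keep]


def count_affine_inliers(pts_a, pts_b, n_iters=200, thresh=3.0, seed=0):
    """RANSAC inlier count for a 2D affine model mapping pts_a to pts_b.

    Assumption: an affine model is a simplified stand-in for the
    homography usually fitted during spatial verification.
    """
    if len(pts_a) < 3:
        return 0
    rng = np.random.default_rng(seed)
    homog_a = np.hstack([pts_a, np.ones((len(pts_a), 1))])  # shape (N, 3)
    best = 0
    for _ in range(n_iters):
        sample = rng.choice(len(pts_a), size=3, replace=False)
        # Solve homog_a[sample] @ M = pts_b[sample] for the 3x2 matrix M.
        M, *_ = np.linalg.lstsq(homog_a[sample], pts_b[sample], rcond=None)
        errors = np.linalg.norm(homog_a @ M - pts_b, axis=1)
        best = max(best, int((errors < thresh).sum()))
    return best


def rerank(query_local, candidates_local, candidate_ids):
    """Re-order retrieved candidates by spatial-verification inlier count.

    query_local and each entry of candidates_local are (keypoints,
    descriptors) pairs with shapes (N, 2) and (N, D).
    """
    q_kpts, q_desc = query_local
    scores = []
    for kpts, desc in candidates_local:
        ia, ib = mutual_nn_matches(q_desc, desc)
        scores.append(count_affine_inliers(q_kpts[ia], kpts[ib]))
    order = np.argsort(-np.asarray(scores), kind="stable")
    return [candidate_ids[i] for i in order], [scores[i] for i in order]
```

In this sketch, `retrieve_top_k` would shortlist candidates using global descriptors, and `rerank` would then reorder that shortlist by the number of geometrically consistent local matches, so that a candidate whose local features agree with the query under a single transformation is promoted even if its global similarity was lower.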