Place recognition is a challenging but crucial task in robotics. Current description-based methods may be limited by their representation capability, while pairwise similarity-based methods require exhaustive searches, which are time-consuming. In this paper, we present a novel coarse-to-fine approach that addresses these problems by combining BEV (Bird's Eye View) feature extraction, coarse-grained matching, and fine-grained verification. In the coarse stage, our approach uses an attention-guided network to generate attention-guided descriptors, and we then employ a fast affinity-based candidate selection process to identify the Top-K most similar candidates. In the fine stage, we estimate pairwise overlap among the narrowed-down place candidates to determine the final match. Experimental results on the KITTI and KITTI-360 datasets demonstrate that our approach outperforms state-of-the-art methods. The code will be released publicly soon.
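The coarse-to-fine pipeline described above can be sketched in a minimal form: rank the database by descriptor affinity, keep the Top-K candidates, then re-score only those with a pairwise overlap estimate. This is a hypothetical illustration, not the paper's implementation; in the actual method the descriptors come from the attention-guided network over BEV features and `overlap_fn` is a learned overlap estimator, both of which are stand-ins here.

```python
import numpy as np

def coarse_to_fine_match(query_desc, db_descs, overlap_fn, k=5):
    """Sketch of coarse-to-fine place recognition.

    Coarse stage: rank all database descriptors by cosine affinity
    to the query and keep the Top-K most similar candidates.
    Fine stage: score only those candidates with a pairwise overlap
    estimate and return the index of the best match.
    """
    # Coarse: cosine affinity between the query and every database descriptor
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    affinity = db @ q
    topk = np.argsort(-affinity)[:k]

    # Fine: pairwise overlap verification restricted to the K candidates,
    # avoiding an exhaustive pairwise search over the whole database
    overlaps = [overlap_fn(query_desc, db_descs[i]) for i in topk]
    return int(topk[int(np.argmax(overlaps))])
```

The key efficiency point is that the expensive pairwise verification runs only K times instead of once per database entry, which is what makes the coarse affinity-based filtering worthwhile.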