Place recognition plays a crucial role in robotics and computer vision, with applications in autonomous driving, mapping, and localization. The task is to identify a place by matching query sensor data against a known database. A central challenge is to develop a model that delivers accurate results while remaining robust to environmental variations. We propose two multi-modal place recognition models, namely PRFusion and PRFusion++. PRFusion employs global fusion with manifold metric attention, enabling effective interaction between features without requiring camera-LiDAR extrinsic calibrations. In contrast, PRFusion++ assumes the availability of extrinsic calibrations and leverages pixel-point correspondences to enhance feature learning on local windows. Additionally, both models incorporate neural diffusion layers, which enable reliable operation even in challenging environments. We verify the state-of-the-art performance of both models on three large-scale benchmarks; notably, they outperform existing models by a substantial margin of +3.0 AR@1 on the demanding Boreas dataset. We further conduct ablation studies to validate the effectiveness of the proposed components. The code is available at: https://github.com/sijieaaa/PRFusion