LiDARs are widely used for mapping and localization in dynamic environments. However, their high cost limits their widespread adoption. In contrast, monocular localization in LiDAR maps using inexpensive cameras is a cost-effective alternative for large-scale deployment. Nevertheless, most existing approaches struggle to generalize to new sensor setups and environments, requiring retraining or fine-tuning. In this paper, we present CMRNext, a novel approach for camera-LiDAR matching that is independent of sensor-specific parameters, generalizable, and usable in the wild for monocular localization in LiDAR maps and camera-LiDAR extrinsic calibration. CMRNext exploits recent advances in deep neural networks for matching cross-modal data and standard geometric techniques for robust pose estimation. We reformulate the point-pixel matching problem as an optical flow estimation problem and solve the Perspective-n-Point problem based on the resulting correspondences to find the relative pose between the camera and the LiDAR point cloud. We extensively evaluate CMRNext on six different robotic platforms, including three publicly available datasets and three in-house robots. Our experimental evaluations demonstrate that CMRNext outperforms existing approaches on both tasks and effectively generalizes to previously unseen environments and sensor setups in a zero-shot manner. We make the code and pre-trained models publicly available at http://cmrnext.cs.uni-freiburg.de.
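The point-pixel correspondences mentioned above rest on the standard pinhole projection: a 3D LiDAR point is transformed into the camera frame by the (unknown) relative pose and projected to pixel coordinates, which is also the geometric relation that the Perspective-n-Point solver inverts. The following is a minimal, self-contained sketch of that forward projection; the rotation, translation, and intrinsics values are hypothetical placeholders, not parameters from the paper.

```python
# Sketch of pinhole projection of a LiDAR point into the image plane,
# the relation linking point-pixel correspondences to the camera-LiDAR
# relative pose. All numeric values below are illustrative assumptions.

def project_point(p_lidar, R, t, fx, fy, cx, cy):
    """Transform a 3D point from the LiDAR frame to the camera frame
    via rotation R and translation t, then project it with the pinhole
    model using focal lengths (fx, fy) and principal point (cx, cy)."""
    # Rigid-body transform: p_cam = R @ p_lidar + t
    p_cam = [sum(R[i][j] * p_lidar[j] for j in range(3)) + t[i]
             for i in range(3)]
    x, y, z = p_cam
    if z <= 0:
        return None  # point lies behind the camera, not observable
    # Perspective division and shift to pixel coordinates
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)

# Hypothetical intrinsics; identity extrinsics for illustration only
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
uv = project_point([1.0, 2.0, 10.0], R, t,
                   fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(uv)  # (370.0, 340.0)
```

In the method summarized above, the network predicts which pixel each LiDAR point maps to (cast as optical flow), and the PnP solver then recovers the pose that best explains those predicted correspondences under this projection model.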