LiDARs are widely used for mapping and localization in dynamic environments. However, their high cost limits their widespread adoption. On the other hand, monocular localization in LiDAR maps using inexpensive cameras is a cost-effective alternative for large-scale deployment. Nevertheless, most existing approaches struggle to generalize to new sensor setups and environments, requiring retraining or fine-tuning. In this paper, we present CMRNext, a novel approach for camera-LiDAR matching that is independent of sensor-specific parameters, generalizes well, and can be used in the wild for monocular localization in LiDAR maps and for camera-LiDAR extrinsic calibration. CMRNext exploits recent advances in deep neural networks for matching cross-modal data and standard geometric techniques for robust pose estimation. We reformulate the point-pixel matching problem as an optical flow estimation problem and solve the Perspective-n-Point problem based on the resulting correspondences to find the relative pose between the camera and the LiDAR point cloud. We extensively evaluate CMRNext on six different robotic platforms, including three publicly available datasets and three in-house robots. Our experimental evaluations demonstrate that CMRNext outperforms existing approaches on both tasks and effectively generalizes to previously unseen environments and sensor setups in a zero-shot manner. We make the code and pre-trained models publicly available at http://cmrnext.cs.uni-freiburg.de.
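The geometric half of the pipeline described above — turning point-pixel correspondences into a camera pose — can be sketched with a minimal Direct Linear Transform (DLT) solver for the Perspective-n-Point problem. This is an illustrative stand-in, not the paper's implementation: a production system would feed the flow network's dense matches into a robust RANSAC-based PnP solver, and the function name and synthetic correspondences below are ours.

```python
import numpy as np

def pnp_dlt(pts3d, pts2d_norm):
    """Recover a camera pose (R, t) from n >= 6 noiseless 3D-2D matches.

    pts3d: (n, 3) LiDAR points in the map frame.
    pts2d_norm: (n, 2) normalized image coordinates (pixels pre-multiplied
    by K^-1), standing in for correspondences predicted by a flow network.
    """
    n = pts3d.shape[0]
    Xh = np.hstack([pts3d, np.ones((n, 1))])        # homogeneous 3D points
    # Each match yields two linear equations in the 12 entries of P = [R|t].
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Xh
    A[0::2, 8:12] = -pts2d_norm[:, 0:1] * Xh
    A[1::2, 4:8] = Xh
    A[1::2, 8:12] = -pts2d_norm[:, 1:2] * Xh
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)                        # solution up to scale/sign
    M = P[:, :3]
    # Choose the sign that places the points in front of the camera.
    if np.linalg.det(M) < 0:
        P, M = -P, -M
    # Project M onto SO(3) and recover the metric scale.
    U, S, Vt = np.linalg.svd(M)
    R = U @ Vt
    t = P[:, 3] / S.mean()
    return R, t

# Synthetic check: project points with a known pose, then recover it.
rng = np.random.default_rng(0)
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 0.5])
pts3d = rng.uniform(-1.0, 1.0, (20, 3)) + np.array([0.0, 0.0, 5.0])
cam = (R_true @ pts3d.T).T + t_true
pts2d = cam[:, :2] / cam[:, 2:3]                    # perspective projection
R_est, t_est = pnp_dlt(pts3d, pts2d)
```

With exact correspondences the recovered pose matches the ground truth to numerical precision; with noisy network predictions, wrapping this step in RANSAC (as standard PnP libraries do) rejects outlier matches before the final pose fit.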