LiDARs are widely used for mapping and localization in dynamic environments. However, their high cost limits their widespread adoption. In contrast, monocular localization in LiDAR maps using inexpensive cameras is a cost-effective alternative for large-scale deployment. Nevertheless, most existing approaches struggle to generalize to new sensor setups and environments, requiring retraining or fine-tuning. In this paper, we present CMRNext, a novel approach for camera-LiDAR matching that is independent of sensor-specific parameters, generalizable, and usable in the wild for monocular localization in LiDAR maps and for camera-LiDAR extrinsic calibration. CMRNext exploits recent advances in deep neural networks for matching cross-modal data and standard geometric techniques for robust pose estimation. We reformulate the point-pixel matching problem as an optical flow estimation problem and solve the Perspective-n-Point (PnP) problem based on the resulting correspondences to find the relative pose between the camera and the LiDAR point cloud. We extensively evaluate CMRNext on six different robotic platforms, including three publicly available datasets and three in-house robots. Our experimental evaluations demonstrate that CMRNext outperforms existing approaches on both tasks and effectively generalizes to previously unseen environments and sensor setups in a zero-shot manner. We make the code and pre-trained models publicly available at http://cmrnext.cs.uni-freiburg.de.
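To make the geometric back-end concrete, the sketch below shows the pose-recovery step in isolation: given 3D LiDAR points and the 2D pixel locations they were matched to (in CMRNext these matches come from the optical-flow network; here they are synthetic), the relative camera pose is recovered by solving the PnP problem. This minimal version uses a plain DLT solve in NumPy rather than the robust RANSAC-based solver a real system would use, and the function name and setup are illustrative, not the paper's actual implementation.

```python
import numpy as np

def solve_pnp_dlt(pts3d, pts2d, K):
    """Estimate the camera pose [R|t] from >= 6 3D-2D point
    correspondences via the Direct Linear Transform (DLT)."""
    n = len(pts3d)
    # Normalized image coordinates (z = 1) using the inverse intrinsics.
    pts2d_h = np.hstack([pts2d, np.ones((n, 1))])
    xy = (np.linalg.inv(K) @ pts2d_h.T).T

    # Each correspondence contributes two linear constraints on the
    # 12 entries of the 3x4 pose matrix P = [R|t] (up to scale).
    A = np.zeros((2 * n, 12))
    for i, ((X, Y, Z), (x, y, _)) in enumerate(zip(pts3d, xy)):
        A[2 * i]     = [X, Y, Z, 1, 0, 0, 0, 0, -x*X, -x*Y, -x*Z, -x]
        A[2 * i + 1] = [0, 0, 0, 0, X, Y, Z, 1, -y*X, -y*Y, -y*Z, -y]

    # Null-space solution: right singular vector of the smallest
    # singular value of A.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # Fix the overall sign so the points lie in front of the camera.
    if (P[2] @ np.append(pts3d[0], 1.0)) < 0:
        P = -P

    # Project the left 3x3 block onto SO(3) and undo the DLT scale
    # ambiguity on the translation.
    U, S, Vt3 = np.linalg.svd(P[:, :3])
    R = U @ Vt3
    if np.linalg.det(R) < 0:
        R = U @ np.diag([1.0, 1.0, -1.0]) @ Vt3
    t = P[:, 3] / S.mean()
    return R, t
```

With noise-free correspondences this recovers the ground-truth pose exactly; with learned matches, a RANSAC loop around the solver (as in standard PnP pipelines) rejects outlier correspondences.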