Localization remains a challenging task for autonomous navigation. A loop detection algorithm must overcome environmental changes to support place recognition and re-localization of robots. Deep learning has therefore been studied extensively as a way to transform measurements consistently into localization descriptors. Street view images are easily accessible, but they are vulnerable to appearance changes. LiDAR robustly provides precise structural information; however, constructing a point cloud database is expensive, and point clouds exist only for a limited number of places. Unlike previous works that train networks to produce a shared embedding directly between 2D images and 3D point clouds, we transform both types of data into 2.5D depth images for matching. In this work, we propose a novel cross-matching method, called (LC)$^2$, for achieving LiDAR localization without a prior point cloud map. To this end, LiDAR measurements are expressed as range images before matching, which reduces the modality discrepancy. The network is then trained to extract localization descriptors from disparity and range images, and the best matches are employed as loop factors in a pose graph. Using public datasets that include multiple sessions recorded under significantly different lighting conditions, we demonstrate that LiDAR-based navigation systems can be optimized from image databases and vice versa.
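To make the range-image representation concrete, the following is a minimal sketch of a spherical projection from a LiDAR point cloud to a 2D range image. It is an illustration under stated assumptions, not the (LC)$^2$ implementation: the function name `points_to_range_image`, the image size, and the vertical field of view are hypothetical values chosen for the example.

```python
import math


def points_to_range_image(points, height=16, width=64,
                          fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project 3D points (x, y, z) onto a 2D range image.

    Each pixel stores the distance to the closest point that falls into it.
    Pixels without any return keep the value 0.0.
    All default parameters are illustrative assumptions.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # skip degenerate returns at the sensor origin

        yaw = math.atan2(y, x)    # azimuth in [-pi, pi]
        pitch = math.asin(z / r)  # elevation angle
        if not (fov_down <= pitch <= fov_up):
            continue  # outside the assumed vertical field of view

        # Map azimuth to a column and elevation to a row (top row = fov_up).
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((fov_up - pitch) / fov * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)

        # Keep the nearest return when several points share a pixel.
        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image
```

Calling `points_to_range_image(scan)` on a list of `(x, y, z)` tuples returns a `height` by `width` grid of ranges. A grid of this kind is the sort of 2.5D input that could be compared against depth or disparity estimated from camera images.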