Relocalization is the basis of map-based localization algorithms. Camera-based and LiDAR-based map methods are pervasive because of their robustness across different scenarios. Generally, mapping and localizing with the same sensor yields better accuracy, since matching features between data of the same type is easier. However, because cameras lack 3D information and LiDAR is expensive, cross-media methods, which combine live image data with a LiDAR map, are emerging. Although matching features across different media is challenging, we believe cross-media methods are the trend for autonomous vehicle (AV) relocalization, since they are low-cost and their accuracy can be comparable to that of same-sensor methods. In this paper, we propose CMSG, a novel cross-media algorithm for AV relocalization tasks. Semantic features are used to better interpret the correlation between point clouds and image features. In addition, abstracted semantic graph nodes are introduced, and a graph network architecture is integrated to better extract the similarity of semantic features. Validation experiments are conducted on the KITTI odometry dataset. Our results show that CMSG achieves accuracy comparable to, or even better than, current single-sensor methods at a speed of 25 FPS on an NVIDIA 1080 Ti GPU.