I2P-Rec: Recognizing Images on Large-scale Point Cloud Maps through Bird's Eye View Projections

Place recognition is an important technique for autonomous cars to achieve full autonomy, since it can provide an initial guess to online localization algorithms. Although current methods based on images or point clouds have achieved satisfactory performance, localizing images on a large-scale point cloud map remains a fairly unexplored problem. This cross-modal matching task is challenging because it is difficult to extract consistent descriptors from images and point clouds. In this paper, we propose the I2P-Rec method, which solves the problem by transforming the cross-modal data into the same modality. Specifically, we leverage the recent success of depth estimation networks to recover point clouds from images. We then project the point clouds into Bird's Eye View (BEV) images. Using the BEV image as an intermediate representation, we extract global features with a Convolutional Neural Network followed by a NetVLAD layer to perform matching. Experimental results on the KITTI dataset show that, with only a small set of training data, I2P-Rec achieves Top-1% recall rates above 80% and 90% when localizing monocular and stereo images on point cloud maps, respectively. We further evaluate I2P-Rec on a 1 km trajectory dataset collected by an autonomous logistics car and show that I2P-Rec generalizes well to previously unseen environments.
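
The two geometric steps mentioned above (recovering a point cloud from an estimated depth map, then projecting it to a BEV image) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `depth_to_points` and `points_to_bev`, the pinhole intrinsics `fx, fy, cx, cy`, the camera-frame convention (x right, y down, z forward), the grid extent, the 0.5 m cell size, and the use of per-cell point counts as pixel values are all assumptions made for illustration.

```python
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W), in metres, to an (N, 3) point cloud.

    Assumes a pinhole camera and a camera frame with x right, y down, z forward.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop pixels without a valid depth


def points_to_bev(points, x_range=(-25.0, 25.0), z_range=(0.0, 50.0), cell=0.5):
    """Rasterise points onto the ground plane; each pixel counts the points in its cell."""
    x, z = points[:, 0], points[:, 2]
    keep = (x >= x_range[0]) & (x < x_range[1]) & (z >= z_range[0]) & (z < z_range[1])
    h = int(round((z_range[1] - z_range[0]) / cell))
    w = int(round((x_range[1] - x_range[0]) / cell))
    # Clamp indices to guard against floating-point rounding at the upper edge.
    cols = np.minimum(((x[keep] - x_range[0]) / cell).astype(int), w - 1)
    rows = np.minimum(((z[keep] - z_range[0]) / cell).astype(int), h - 1)
    bev = np.zeros((h, w), dtype=np.float32)
    np.add.at(bev, (rows, cols), 1.0)  # accumulate repeated indices correctly
    return bev


if __name__ == "__main__":
    depth = np.full((4, 6), 10.0)  # toy depth map: a flat wall 10 m ahead
    pts = depth_to_points(depth, fx=100.0, fy=100.0, cx=3.0, cy=2.0)
    bev = points_to_bev(pts)
    print(pts.shape, bev.shape, bev.sum())
```

In a full pipeline, the resulting BEV image would be fed to the CNN and NetVLAD layer to obtain a global descriptor. The same `points_to_bev` step could be applied to submaps cut from the point cloud map so that both modalities end up in the same representation.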


