The primary requirement for cross-modal data fusion is the precise alignment of data from different sensors. However, calibration between LiDAR point clouds and camera images is typically time-consuming and requires an external calibration board or specific environmental features. Cross-modal registration effectively solves this problem by aligning the data directly, without external calibration. However, due to the domain gap between point clouds and images, existing methods rarely achieve satisfactory registration accuracy while maintaining real-time performance. To address this issue, we propose a framework that projects point clouds into several 2D representations for matching with camera images, which not only leverages the geometric characteristics of LiDAR point clouds more effectively but also bridges the domain gap between point clouds and images. Moreover, to tackle the challenges of cross-modal differences and the limited overlap between LiDAR point clouds and images in the image-matching task, we introduce a multi-scale feature extraction network that effectively extracts features from both camera images and the projection maps of the LiDAR point cloud. Additionally, we propose a patch-to-pixel matching network that provides more effective supervision and achieves higher accuracy. We validate the performance of our model through experiments on the KITTI and nuScenes datasets. Our network achieves real-time performance and high registration accuracy: on the KITTI dataset, our model achieves a registration accuracy rate of over 99\%.
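The projection of a LiDAR point cloud into a 2D representation can be illustrated with a minimal sketch. The snippet below builds a range image via spherical projection, a common choice for such 2D representations; the resolution, vertical field of view (mimicking a 64-beam sensor), and all function names are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def spherical_projection(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Project LiDAR points of shape (N, 3) onto an (h, w) range image.

    Illustrative sketch: sensor parameters are assumed, not taken
    from the paper.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)  # azimuth angle in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(depth, 1e-8), -1.0, 1.0))

    fov_up_rad = np.radians(fov_up)
    fov_rad = np.radians(fov_up - fov_down)

    # Map azimuth to columns and elevation to rows of the image.
    u = 0.5 * (1.0 - yaw / np.pi) * w
    v = (fov_up_rad - pitch) / fov_rad * h
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    # Write far points first so the nearest return wins per pixel.
    order = np.argsort(-depth)
    range_image = np.zeros((h, w), dtype=np.float32)
    range_image[v[order], u[order]] = depth[order]
    return range_image
```

Unlike raw 3D points, this image-like representation can be fed to the same kind of 2D feature extractor used for camera images, which is what makes cross-modal matching in 2D possible.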