CoFiI2P: Coarse-to-Fine Correspondences for Image-to-Point Cloud Registration

Image-to-point cloud (I2P) registration is a fundamental task in autonomous vehicles and transportation systems, where it supports cross-modality data fusion and localization. Existing I2P registration methods estimate correspondences at the point/pixel level and often overlook global alignment. Without high-level guidance from global constraints, however, I2P matching can easily converge to a local optimum. To address this issue, this paper introduces CoFiI2P, a novel I2P registration network that extracts correspondences in a coarse-to-fine manner to achieve the globally optimal solution. First, the image and point cloud data are processed through a Siamese encoder-decoder network for hierarchical feature extraction. Second, a coarse-to-fine matching module is designed to leverage these features and establish robust feature correspondences. Specifically, in the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information from the image and point cloud data. This enables the estimation of coarse super-point/super-pixel matching pairs with discriminative descriptors. In the fine matching phase, point/pixel pairs are established under the guidance of the super-point/super-pixel correspondences. Finally, the transformation matrix is estimated from the matching pairs with the EPnP-RANSAC algorithm. Extensive experiments conducted on the KITTI dataset demonstrate that CoFiI2P achieves a relative rotation error (RRE) of 1.14 degrees and a relative translation error (RTE) of 0.29 meters. These results represent an improvement of 84% in RRE and 89% in RTE compared to the current state-of-the-art (SOTA) method. Qualitative results are available at https://youtu.be/ovbedasXuZE. The source code will be publicly released at https://github.com/kang-1-2-3/CoFiI2P.
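
To make the coarse-to-fine idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that descriptors for super-points, super-pixels, points, and pixels have already been computed (in CoFiI2P these come from the learned network and the I2P transformer, which are not reproduced here), and it swaps in plain cosine-similarity nearest-neighbour search for the learned matching. The function name `coarse_to_fine_match` and the arguments `pt_owner` and `px_owner` (the index of the super-point or super-pixel that each point or pixel belongs to) are hypothetical names introduced for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def coarse_to_fine_match(sp_desc, spx_desc, pt_desc, px_desc, pt_owner, px_owner):
    """Illustrative coarse-to-fine matching (assumption: nearest neighbour in descriptor space).

    sp_desc  : (S, D) super-point descriptors
    spx_desc : (P, D) super-pixel descriptors
    pt_desc  : (N, D) point descriptors
    px_desc  : (M, D) pixel descriptors
    pt_owner : (N,) index of the super-point that owns each point
    px_owner : (M,) index of the super-pixel that owns each pixel
    Returns a list of (point_index, pixel_index) pairs.
    """
    # Coarse stage: every super-point picks its most similar super-pixel.
    coarse_sim = l2_normalize(sp_desc) @ l2_normalize(spx_desc).T
    coarse_pairs = coarse_sim.argmax(axis=1)

    pt_n = l2_normalize(pt_desc)
    px_n = l2_normalize(px_desc)

    matches = []
    for s, p in enumerate(coarse_pairs):
        # Fine stage: search only among pixels of the matched super-pixel,
        # so the coarse pair constrains where a point may land in the image.
        pts = np.flatnonzero(pt_owner == s)
        pxs = np.flatnonzero(px_owner == p)
        if pts.size == 0 or pxs.size == 0:
            continue
        fine_sim = pt_n[pts] @ px_n[pxs].T
        best = fine_sim.argmax(axis=1)
        matches.extend(zip(pts.tolist(), pxs[best].tolist()))
    return matches
```

The resulting point/pixel pairs are the kind of 2D-3D correspondences that a PnP solver consumes. One plausible way to carry out the final step would be OpenCV's `cv2.solvePnPRansac` with the `cv2.SOLVEPNP_EPNP` flag, given the camera intrinsics. Whether the authors use that exact routine is not stated in the abstract.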
