We present CrossLoc3D, a novel 3D place recognition method that solves a large-scale point matching problem in a cross-source setting. Cross-source point cloud data corresponds to point sets captured by depth sensors with different accuracies or from different distances and perspectives. We address the challenges of developing 3D place recognition methods that account for the representation gap between points captured by different sources. Our method handles cross-source data by utilizing multi-grained features and selecting convolution kernel sizes that correspond to the most prominent features. Inspired by diffusion models, our method uses a novel iterative refinement process that gradually shifts the embedding spaces from different sources to a single canonical space for better metric learning. In addition, we present CS-Campus3D, the first 3D aerial-ground cross-source dataset consisting of point cloud data from both aerial and ground LiDAR scans. The point clouds in CS-Campus3D have representation gaps and other differences, such as different views, point densities, and noise patterns. We show that CrossLoc3D achieves an improvement of 4.74% to 15.37% in top-1 average recall on our CS-Campus3D benchmark and performance comparable to state-of-the-art 3D place recognition methods on the Oxford RobotCar dataset. The code and the CS-Campus3D benchmark will be available at github.com/rayguan97/crossloc3d.
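To make the reported metric concrete, the following is a minimal sketch of how top-1 recall can be computed for a retrieval-style place recognition task. It is not the CrossLoc3D evaluation code: the function names, the toy embeddings, and the `positives` structure are all hypothetical, and the benchmark's exact protocol (for example, the distance threshold that defines a true match and how the average is taken) is not stated in the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top1_recall(query_embs, db_embs, positives):
    """Fraction of queries whose nearest database embedding is a true match.

    positives[i] is the set of database indices that depict the same place
    as query i (an assumed ground-truth format). Queries without any
    positive are skipped.
    """
    hits = 0
    evaluated = 0
    for q, pos in zip(query_embs, positives):
        if not pos:
            continue
        evaluated += 1
        # Index of the database embedding closest to the query.
        nearest = min(range(len(db_embs)), key=lambda j: euclidean(q, db_embs[j]))
        if nearest in pos:
            hits += 1
    return hits / evaluated if evaluated else 0.0


# Toy example: database entries stand in for embeddings from one source,
# queries for embeddings from the other, both in a shared embedding space.
db = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
queries = [[0.1, 0.0], [0.9, 0.2], [0.4, 0.6]]
positives = [{0}, {1}, {1}]

recall = top1_recall(queries, db, positives)
print(f"top-1 recall: {recall:.3f}")
```

In this toy data the third query lands closest to a database entry that is not one of its positives, so two of the three queries are retrieved correctly.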