We tackle the challenge of LiDAR-based place recognition, which traditionally depends on costly and time-consuming prior 3D maps. To overcome this, we first construct the LiRSI-XA dataset, which comprises approximately $110,000$ remote sensing submaps and $13,000$ LiDAR point cloud submaps captured in urban scenes, and then propose L2RSI, a novel method for cross-view LiDAR place recognition using high-resolution Remote Sensing Imagery. By leveraging readily available overhead images as map proxies, L2RSI enables large-scale localization at a reduced cost. It addresses the dual challenges of cross-view and cross-modal place recognition by learning feature alignment between point cloud submaps and remote sensing submaps in the semantic domain. Additionally, we introduce a novel probability propagation method based on particle estimation that refines position predictions by effectively exploiting temporal and spatial information, enabling large-scale retrieval and cross-scene generalization without fine-tuning. Extensive experiments on LiRSI-XA demonstrate that, within a $100\,km^2$ retrieval range, L2RSI accurately localizes $83.27\%$ of point cloud submaps within a $30\,m$ radius of the top-$1$ retrieved location. Our project page is publicly available at https://shizw695.github.io/L2RSI/.
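The probability propagation step described above can be illustrated with a generic particle-estimation sketch. This is not the authors' implementation; it is a minimal particle filter over 2D map positions, where the function names (`propagate_particles`, `sim_fn`), the Gaussian similarity model, and all numeric parameters are illustrative assumptions standing in for the paper's cross-modal retrieval scores.

```python
import numpy as np

rng = np.random.default_rng(0)

def propagate_particles(particles, weights, motion, motion_std, sim_fn):
    """One step of particle-based probability propagation (illustrative sketch).

    particles: (N, 2) candidate 2D positions on the overhead map
    weights:   (N,) current belief over the particles
    motion:    (2,) estimated displacement between consecutive frames
    sim_fn:    maps positions to retrieval similarity scores (stand-in
               observation model; the paper uses cross-modal retrieval)
    """
    # Predict: shift particles by the estimated motion plus diffusion noise,
    # exploiting temporal information between consecutive submaps.
    particles = particles + motion + rng.normal(0.0, motion_std, particles.shape)
    # Update: reweight each particle by the similarity score at its position,
    # fusing the spatial evidence from the current retrieval.
    weights = weights * sim_fn(particles)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights**2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights

# Toy usage: a Gaussian similarity peak at a hypothetical true position;
# the belief should concentrate there after a few propagation steps.
true_pos = np.array([50.0, 50.0])
sim = lambda p: np.exp(-np.linalg.norm(p - true_pos, axis=1) ** 2 / 200.0)
particles = rng.uniform(0.0, 100.0, (500, 2))
weights = np.full(500, 1.0 / 500)
for _ in range(10):
    particles, weights = propagate_particles(particles, weights,
                                             np.zeros(2), 1.0, sim)
estimate = (particles * weights[:, None]).sum(axis=0)
```

The weighted mean `estimate` converges toward the similarity peak, which is the refinement effect the abstract attributes to propagating position probabilities over time.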