Large-scale LiDAR mapping and localization leverage place recognition techniques to mitigate odometry drift, ensuring accurate maps. These techniques use scene representations derived from LiDAR point clouds to identify previously visited sites in a database. Local descriptors, assigned to each point in a point cloud, are aggregated into a single scene representation for that cloud. These descriptors are also used to re-rank the retrieved point clouds by geometric fitness scores. We propose SALSA, a novel, lightweight, and efficient framework for LiDAR place recognition. It consists of a Sphereformer backbone that uses radial window attention to aggregate information from sparse, distant points; an adaptive self-attention layer that pools local descriptors into tokens; and a multi-layer-perceptron (MLP) Mixer layer that aggregates the tokens into a scene descriptor. The proposed framework outperforms existing methods on multiple LiDAR place recognition datasets in both retrieval and metric localization while operating in real time.
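The aggregation pipeline described above (attention-based pooling of local descriptors into tokens, followed by Mixer-style token mixing and a final scene descriptor) can be sketched in NumPy. This is a simplified illustration under assumed shapes, not the authors' implementation: the learnable queries, hidden width, and single Mixer layer are placeholders, and the Sphereformer backbone that produces the local descriptors is mocked with random data.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(local_desc, queries):
    # Pool N local descriptors (N, C) into T tokens (T, C) via
    # scaled dot-product attention with learnable queries (T, C).
    C = local_desc.shape[1]
    attn = softmax(queries @ local_desc.T / np.sqrt(C), axis=-1)  # (T, N)
    return attn @ local_desc  # (T, C)

def mixer_layer(tokens, W_tok1, W_tok2, W_ch1, W_ch2):
    # Token-mixing MLP: mixes information across the token dimension.
    y = tokens + W_tok2 @ np.maximum(W_tok1 @ tokens, 0)
    # Channel-mixing MLP: mixes information across the channel dimension.
    y = y + np.maximum(y @ W_ch1, 0) @ W_ch2
    return y

rng = np.random.default_rng(0)
N, C, T, H = 1024, 64, 8, 128  # points, channels, tokens, hidden width (assumed)
local_desc = rng.standard_normal((N, C))  # stand-in for backbone output
queries = rng.standard_normal((T, C))     # stand-in for learned queries

tokens = attention_pool(local_desc, queries)                 # (T, C)
mixed = mixer_layer(
    tokens,
    rng.standard_normal((H, T)) * 0.01, rng.standard_normal((T, H)) * 0.01,
    rng.standard_normal((C, H)) * 0.01, rng.standard_normal((H, C)) * 0.01,
)
scene_desc = mixed.mean(axis=0)                              # (C,) scene descriptor
scene_desc /= np.linalg.norm(scene_desc)                     # L2-normalize for retrieval
```

In a retrieval setting, the normalized scene descriptor would be compared against a database of descriptors by cosine similarity, with the stored local descriptors then used for geometric re-ranking of the top candidates.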