Large-scale LiDAR mapping and localization leverage place recognition techniques to mitigate odometry drift, ensuring accurate mapping. These techniques use scene representations derived from LiDAR point clouds to identify previously visited sites within a database. Local descriptors, assigned to each point in a point cloud, are aggregated into a scene representation for that point cloud. These descriptors are also used to re-rank the retrieved point clouds based on geometric fitness scores. We propose SALSA, a novel, lightweight, and efficient framework for LiDAR place recognition. It consists of a SphereFormer backbone that uses radial window attention to enable information aggregation for sparse distant points, an adaptive self-attention layer that pools local descriptors into tokens, and a multi-layer-perceptron Mixer layer that aggregates the tokens into a scene descriptor. The proposed framework outperforms existing methods on various LiDAR place recognition datasets in terms of both retrieval and metric localization while operating in real time.
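The aggregation pipeline described above can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation: the weight shapes, the single Mixer layer, and the final mean pooling are assumptions made for clarity, and the backbone's local descriptors are replaced by random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(local_desc, queries):
    # Pool N local descriptors (N, D) into K tokens (K, D) with
    # K learned queries (K, D); a stand-in for the adaptive
    # self-attention pooling layer.
    attn = softmax(queries @ local_desc.T, axis=-1)  # (K, N)
    return attn @ local_desc                         # (K, D)

def mixer_layer(tokens, w_tok, w_ch):
    # One MLP-Mixer-style layer with residuals: mix information
    # across tokens, then across channels.
    tokens = tokens + w_tok @ tokens   # token mixing, (K, K)
    tokens = tokens + tokens @ w_ch    # channel mixing, (D, D)
    return tokens

N, D, K = 1000, 64, 8                  # points, descriptor dim, tokens
local_desc = rng.normal(size=(N, D))   # placeholder backbone output
queries = rng.normal(size=(K, D)) * 0.1
w_tok = rng.normal(size=(K, K)) * 0.1
w_ch = rng.normal(size=(D, D)) * 0.1

tokens = attention_pool(local_desc, queries)
tokens = mixer_layer(tokens, w_tok, w_ch)
scene_desc = tokens.mean(axis=0)       # global scene descriptor, (D,)
print(scene_desc.shape)
```

The scene descriptor produced this way is what would be matched against the database for retrieval, while the per-point local descriptors remain available for geometric re-ranking.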