Precise localization is critical for autonomous vehicles. We present a self-supervised learning method that, for the first time, employs Transformers for outdoor localization using LiDAR data. We propose a pretext task that reorganizes the slices of a $360^\circ$ LiDAR scan to leverage its axial properties. Our model, called Slice Transformer, employs multi-head attention while systematically processing the slices. To the best of our knowledge, this is the first instance of leveraging multi-head attention for outdoor point clouds. We additionally introduce the Perth-WA dataset, which provides a large-scale LiDAR map of Perth city in Western Australia covering an area of $\sim$4 km$^2$, together with localization annotations. The proposed localization method is thoroughly evaluated on the Perth-WA and Apollo-SouthBay datasets. We also establish the efficacy of our self-supervised learning approach for the common downstream task of object classification using the ModelNet40 and ScanNN datasets. The code and Perth-WA data will be publicly released.
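To make the slice-based pretext task concrete, the following is a minimal sketch of one way a $360^\circ$ scan could be cut into azimuthal slices and reorganized. The abstract does not specify the number of slices or how they are rearranged, so the equal-angle binning, the random permutation, and the names `slice_scan` and `shuffle_slices` are all assumptions made for illustration, not the released implementation.

```python
import math
import random


def slice_scan(points, num_slices):
    """Bin (x, y, z) points into equal-angle azimuthal slices around the z-axis.

    Assumption: slices are equal-width wedges of the 360-degree sweep.
    """
    slices = [[] for _ in range(num_slices)]
    for x, y, z in points:
        # Map azimuth from [-pi, pi] to a fraction in [0, 1].
        frac = (math.atan2(y, x) + math.pi) / (2.0 * math.pi)
        # Clamp so that an azimuth of exactly +pi falls in the last slice.
        k = min(int(frac * num_slices), num_slices - 1)
        slices[k].append((x, y, z))
    return slices


def shuffle_slices(slices, rng):
    """Reorder the slices at random and return the new order as a label.

    Assumption: a network would be trained to recover `order` from the
    shuffled slices; the abstract does not state the exact objective.
    """
    order = list(range(len(slices)))
    rng.shuffle(order)
    return [slices[i] for i in order], order


# Synthetic stand-in for a LiDAR scan (hypothetical data, not Perth-WA).
rng = random.Random(0)
scan = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
slices = slice_scan(scan, num_slices=8)
shuffled, order = shuffle_slices(slices, rng)
```

Here `shuffled[j]` holds the points that originally belonged to slice `order[j]`, so `order` can serve as a supervision signal that requires no manual annotation.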