We are interested in understanding whether retrieval-based localization approaches are good enough in the context of self-driving vehicles. Towards this goal, we introduce Pit30M, a new image and LiDAR dataset with over 30 million frames, which is 10 to 100 times larger than those used in previous work. Pit30M is captured under diverse conditions (e.g., season, weather, time of day, traffic), and provides accurate localization ground truth. We also automatically annotate our dataset with historical weather and astronomical data, as well as with image and LiDAR semantic segmentation as a proxy measure for occlusion. We benchmark multiple existing methods for image and LiDAR retrieval and, in the process, introduce a simple yet effective convolutional network-based LiDAR retrieval method that is competitive with the state of the art. Our work provides, for the first time, a benchmark for sub-metre retrieval-based localization at city scale. The dataset, its Python SDK, and further information about the sensors, calibration, and metadata are available on the project website: https://pit30m.github.io/