LiDAR devices are widely used in autonomous driving, and research on 3D point clouds has made remarkable progress in recent years. However, deep learning-based methods rely heavily on annotated data and often face the domain generalization problem. Unlike 2D images, whose domains are usually tied to texture information, the features extracted from a 3D point cloud are affected by the distribution of the points. Due to the lack of a 3D domain adaptation benchmark, the common practice is to train a model on one benchmark (e.g., Waymo) and evaluate it on another dataset (e.g., KITTI). In this setting, however, two types of domain gap are present at once, the scenario domain and the sensor domain, which makes evaluation and analysis complicated and difficult. To address this, we propose the LiDAR Dataset with Cross-Sensors (LiDAR-CS Dataset), which contains large-scale annotated LiDAR point clouds from 6 groups of different sensors over the same corresponding scenarios, captured with a hybrid realistic LiDAR simulator. As far as we know, the LiDAR-CS Dataset is the first dataset focused on sensor domain gaps (e.g., the point distribution) for 3D object detection in real traffic. Furthermore, we evaluate and analyze the performance of several baseline detectors on the LiDAR-CS benchmark and show its applications.