Neural-network-based LiDAR object detection algorithms for autonomous driving require large amounts of data for training, validation, and testing. Because real-world data collection and labeling are time-consuming and expensive, simulation-based synthetic data generation is a viable alternative. However, training neural networks on simulated data introduces a domain shift between training and testing data due to differences in scenes, scenarios, and distributions. In this work, we quantify the sim-to-real domain shift by means of LiDAR object detectors trained on new scenario-identical real-world and simulated datasets. In addition, we answer two questions: how well the simulated data resembles the real-world data, and how well object detectors trained on simulated data perform on real-world data. Further, we analyze point clouds at the target level by comparing real-world and simulated point clouds within the 3D bounding boxes of the targets. Our experiments show that a significant sim-to-real domain shift exists even for our scenario-identical datasets. This domain shift amounts to an average precision reduction of around 14% for object detectors trained with simulated data. Additional experiments reveal that this domain shift can be reduced by introducing a simple noise model in simulation. We further show that a simple downsampling method to model real-world physics does not influence the performance of the object detectors.
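To make the two simulation-side modifications mentioned in the abstract more concrete, the following is a minimal sketch of what a simple noise model and a simple downsampling step could look like for a simulated point cloud. The abstract does not specify either method, so everything here is an assumption for illustration: the function names `add_range_noise` and `random_downsample`, the choice of zero-mean Gaussian noise applied along each ray from a sensor at the origin, the uniform random point dropout, and the default values of `sigma` and `keep_ratio`.

```python
import math
import random


def add_range_noise(points, sigma=0.02, rng=None):
    """Perturb each point along its ray from the sensor origin.

    Assumption: zero-mean Gaussian range noise with standard deviation
    `sigma` (in the same unit as the points); not taken from the paper.
    """
    rng = rng or random.Random()
    noisy = []
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            # A point at the sensor origin has no ray direction; keep it.
            noisy.append((x, y, z))
            continue
        # Scale the point so its range becomes r + noise.
        scale = (r + rng.gauss(0.0, sigma)) / r
        noisy.append((x * scale, y * scale, z * scale))
    return noisy


def random_downsample(points, keep_ratio=0.9, rng=None):
    """Keep each point independently with probability `keep_ratio`.

    Assumption: uniform random dropout as a stand-in for a simple
    downsampling method; the paper may use a different scheme.
    """
    rng = rng or random.Random()
    return [p for p in points if rng.random() < keep_ratio]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    simulated = [(10.0, 0.0, 0.0), (0.0, 5.0, 1.0), (3.0, 4.0, 0.0)]
    augmented = random_downsample(add_range_noise(simulated, rng=rng), rng=rng)
    print(len(augmented), "of", len(simulated), "points kept")
```

Perturbing along the ray rather than independently per axis keeps each point on its original beam direction, which is one plausible way to mimic range measurement error; other noise models are equally conceivable.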