Perception sensor models are essential elements of automotive simulation environments; they also serve as powerful tools for creating synthetic datasets to train deep learning-based perception models. Developing realistic perception sensor models is challenging because of the large gap between simulated sensor data and real-world sensor outputs, known as the sim-to-real gap. To address this problem, learning-based models have emerged in recent years as promising solutions, with strong potential to map low-fidelity simulated sensor data into highly realistic outputs. Motivated by this potential, this paper focuses on sim-to-real mapping of point clouds from Lidar, a perception sensor widely used in automated driving systems. We introduce a novel Contrastive-Learning-based Sim-to-Real mapping framework, called CLS2R, inspired by recent advances in image-to-image translation techniques. The proposed CLS2R framework employs a lossless representation of Lidar point clouds that covers all essential Lidar attributes: depth, reflectance, and raydrop. We extensively evaluate the proposed framework against state-of-the-art image-to-image translation methods, using a diverse range of metrics to assess realness, faithfulness, and the impact on the performance of a downstream task. Our results show that CLS2R outperforms these methods on nearly all metrics. Source code is available at https://github.com/hamedhaghighi/CLS2R.git.
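To make the idea of an image-like Lidar representation with depth, reflectance, and raydrop channels more concrete, the following is a minimal sketch of a spherical (range-image) projection in NumPy. It is an illustration under stated assumptions, not the representation used by CLS2R: the abstract does not specify how the lossless encoding is built, and this projection is generally lossy, because points that fall into the same pixel are collapsed to the nearest one. The function name `to_range_image`, the grid size, and the field-of-view values are all hypothetical choices.

```python
import numpy as np

def to_range_image(points, reflectance, h=64, w=1024,
                   fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project a point cloud onto an (h, w) grid via spherical projection.

    Grid size and vertical field of view are illustrative assumptions.
    Returns three (h, w) arrays: depth, reflectance, and a mask that is
    1 where a return landed in the pixel and 0 where no return was recorded
    (treated here as a dropped ray).
    """
    points = np.asarray(points, dtype=np.float64)
    reflectance = np.asarray(reflectance, dtype=np.float64)

    depth = np.linalg.norm(points, axis=1)
    keep = depth > 0  # discard degenerate points at the sensor origin
    points, reflectance, depth = points[keep], reflectance[keep], depth[keep]

    yaw = np.arctan2(points[:, 1], points[:, 0])  # azimuth in [-pi, pi]
    pitch = np.arcsin(points[:, 2] / depth)       # elevation angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    u = 0.5 * (1.0 - yaw / np.pi) * w               # column coordinate
    v = (fov_up - pitch) / (fov_up - fov_down) * h  # row coordinate
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int64)

    # When several points share a pixel, keep only the nearest one:
    # sort by depth, then take the first occurrence of each pixel index.
    order = np.argsort(depth)
    flat = (v * w + u)[order]
    _, first = np.unique(flat, return_index=True)
    sel = order[first]

    depth_img = np.zeros((h, w))
    refl_img = np.zeros((h, w))
    mask = np.zeros((h, w))
    depth_img[v[sel], u[sel]] = depth[sel]
    refl_img[v[sel], u[sel]] = reflectance[sel]
    mask[v[sel], u[sel]] = 1.0
    return depth_img, refl_img, mask
```

Stacking the three returned arrays along a channel axis gives a multi-channel image that image-to-image translation networks can consume directly, which is one reason this family of representations pairs naturally with such methods.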