Cameras and LiDARs are both important sensors for autonomous driving, playing critical roles in 3D object detection. Camera-LiDAR fusion has become a prevalent solution for robust and accurate driving perception. In contrast to the vast majority of existing work, which focuses on improving 3D object detection performance through cross-modal schemes, deep learning algorithms, and training tricks, we focus on how sensor configurations affect the performance of learning-based methods. To this end, we propose a unified information-theoretic surrogate metric for camera and LiDAR evaluation, based on our proposed sensor perception model. We also design an accelerated, high-quality framework for data acquisition, model training, and performance evaluation that works with the CARLA simulator. To show the correlation between detection performance and our surrogate metric, we conduct experiments using several camera-LiDAR placements and parameters inspired by self-driving companies and research institutions. Extensive experiments with representative algorithms on the nuScenes dataset validate the effectiveness of our surrogate metric and demonstrate that sensor configurations significantly impact detection models based on point-cloud-image fusion, contributing up to a 30% discrepancy in average precision.