Multimodal sensor fusion methods for 3D object detection have been revolutionizing autonomous driving research. Nevertheless, most of these methods rely heavily on dense LiDAR data and accurately calibrated sensors, conditions that often do not hold in real-world scenarios. LiDAR and camera data are often misaligned due to miscalibration, decalibration, or differing sensor frequencies. Additionally, parts of the LiDAR data may be occluded, and parts may be missing due to hardware malfunction or weather conditions. This work presents a novel fusion step that addresses data corruptions and makes sensor fusion for 3D object detection more robust. Through extensive experiments, we demonstrate that our method performs on par with state-of-the-art approaches on normal data and outperforms them on misaligned data.