In this paper, we propose a novel training strategy called SupFusion, which provides auxiliary feature-level supervision for effective LiDAR-Camera fusion and significantly boosts detection performance. Our strategy includes a data enhancement method named Polar Sampling, which densifies sparse objects, and trains an assistant model to generate high-quality features that serve as the supervision. These features are then used to train the LiDAR-Camera fusion model, whose fusion feature is optimized to mimic the generated high-quality features. Furthermore, we propose a simple yet effective deep fusion module, which consistently achieves superior performance compared with previous fusion methods under the SupFusion strategy. Our proposal therefore offers the following advantages. First, SupFusion introduces auxiliary feature-level supervision that can boost LiDAR-Camera detection performance without introducing extra inference cost. Second, the proposed deep fusion module can further improve the detector's capability. Both SupFusion and the deep fusion module are plug-and-play, and we conduct extensive experiments to demonstrate their effectiveness. Specifically, we obtain an improvement of around 2% in 3D mAP on the KITTI benchmark across multiple LiDAR-Camera 3D detectors.
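The feature-level supervision described above can be illustrated with a minimal sketch. The abstract does not specify the form of the loss, so the mean squared error used here is an assumption, and all names (`feature_mimic_loss`, `total_training_loss`, `aux_weight`) are hypothetical rather than taken from the authors' code. The sketch only shows the general idea: the fusion feature is pulled toward a fixed target feature produced by the assistant model, and this auxiliary term is added to the ordinary detection loss.

```python
# Minimal sketch of auxiliary feature-level supervision.
# Assumption: the "mimic" objective is a mean squared error; the source
# does not state the actual loss. All names below are hypothetical.

def feature_mimic_loss(fusion_feat, target_feat):
    """Mean squared error between two flat feature vectors of equal length."""
    if len(fusion_feat) != len(target_feat):
        raise ValueError("feature vectors must have the same length")
    squared_errors = [(f - t) ** 2 for f, t in zip(fusion_feat, target_feat)]
    return sum(squared_errors) / len(squared_errors)


def total_training_loss(detection_loss, fusion_feat, target_feat, aux_weight=1.0):
    """Detection loss plus the weighted auxiliary feature loss.

    target_feat stands in for the high-quality feature from the assistant
    model and is treated as a constant during training.
    """
    return detection_loss + aux_weight * feature_mimic_loss(fusion_feat, target_feat)


# Toy values standing in for real feature maps.
fusion_feat = [0.0, 1.0, 2.0]
target_feat = [0.0, 2.0, 4.0]
loss = total_training_loss(0.5, fusion_feat, target_feat, aux_weight=0.1)
```

Because the auxiliary term is used only during training, the assistant model and the target features are not needed at inference time, which is consistent with the claim that no extra inference cost is introduced.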