In recent years, great progress has been made in Lift-Splat-Shot-based (LSS-based) 3D object detection methods, which convert features from the 2D camera view and the 3D lidar view to Bird's-Eye-View (BEV) for feature fusion. However, inaccurate depth estimation (e.g., the 'depth jump' problem) remains an obstacle to developing LSS-based methods. To alleviate the 'depth jump' problem, we propose the Edge-Aware Bird's-Eye-View (EA-BEV) projector. By coupling the proposed edge-aware depth fusion module with a depth estimation module, the EA-BEV projector addresses this problem and enforces refined supervision on depth. In addition, we propose sparse depth supervision and gradient edge depth supervision to constrain learning on global depth and local marginal depth information. The EA-BEV projector is a plug-and-play module for any LSS-based 3D object detection model and effectively improves baseline performance. We demonstrate its effectiveness on the nuScenes benchmark: on the nuScenes validation set, the proposed EA-BEV projector boosts several state-of-the-art LSS-based baselines on both the nuScenes 3D object detection benchmark and the nuScenes BEV map segmentation benchmark, with a negligible increase in inference time.
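To make the two supervision signals concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that a 'depth jump' shows up as a large depth difference between neighbouring pixels, and that sparse supervision is a masked L1 loss over pixels with a valid lidar depth. The function names `depth_gradient_edges` and `sparse_depth_l1`, the forward-difference gradient, and the toy depth values are all illustrative assumptions.

```python
import numpy as np


def depth_gradient_edges(depth):
    """Per-pixel edge strength: the larger of the absolute depth differences
    to the right and bottom neighbours (hypothetical helper, forward differences)."""
    depth = np.asarray(depth, dtype=np.float64)
    gx = np.zeros_like(depth)
    gy = np.zeros_like(depth)
    gx[:, :-1] = np.abs(depth[:, 1:] - depth[:, :-1])  # horizontal jumps
    gy[:-1, :] = np.abs(depth[1:, :] - depth[:-1, :])  # vertical jumps
    return np.maximum(gx, gy)


def sparse_depth_l1(pred, target):
    """Mean absolute error over pixels with a valid (positive) target depth.
    Zeros in `target` are assumed to mark pixels without a lidar return."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    valid = target > 0
    if not valid.any():
        return 0.0
    return float(np.abs(pred[valid] - target[valid]).mean())


# Toy 4x4 depth map: a near object (5 m) in front of a far background (30 m).
depth = np.full((4, 4), 30.0)
depth[1:3, 1:3] = 5.0
edges = depth_gradient_edges(depth)

# Sparse target with only two lidar returns; the prediction is a constant 10 m.
sparse_target = np.zeros((4, 4))
sparse_target[0, 0] = 30.0
sparse_target[1, 1] = 5.0
loss = sparse_depth_l1(np.full((4, 4), 10.0), sparse_target)
```

In this toy case the edge map is non-zero only along the object boundary, which is where a gradient edge term would concentrate its supervision, while the sparse term covers the whole image but only at the few pixels that have a lidar measurement.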