Recently, 3D object detection has attracted significant attention and has improved steadily in real road scenarios. Environmental information is collected from a single sensor or fused from multiple sensors to detect objects of interest. However, most current 3D object detection approaches focus on developing advanced network architectures to improve detection precision and do not consider dynamic driving scenes, in which the data collected by vehicle-mounted sensors contain various perturbation features. As a result, existing work still cannot handle such perturbations. To address this problem, we propose a group equivariant bird's eye view network (GeqBevNet), which builds on group equivariance theory and introduces group equivariance into the BEV fusion object detection network. A group equivariant network operates on the fused BEV feature map to extract rotation-equivariant features at the BEV level, which leads to a lower average orientation error. To demonstrate its effectiveness, we evaluate GeqBevNet on the nuScenes validation set, where it reduces the mean average orientation error (mAOE) to 0.325. Experimental results demonstrate that GeqBevNet extracts more rotation-equivariant features in 3D object detection for real road scenes and improves object orientation prediction.
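To make the notion of BEV-level rotational equivariance concrete, the following minimal sketch checks the property for the simplest case: a single-channel lifting convolution over the group C4 of 90-degree rotations. It is an illustration under stated assumptions, not the GeqBevNet architecture: the choice of C4, the single-channel toy feature map, and the helper names `correlate_valid`, `c4_lifting_conv`, and `is_c4_equivariant` are all hypothetical. Rotating the input map by 90 degrees should rotate each response map by 90 degrees and cyclically shift the orientation axis by one step.

```python
import numpy as np


def correlate_valid(feature, kernel):
    """Plain 'valid' 2D cross-correlation of a square map with a square kernel."""
    h, w = feature.shape
    k = kernel.shape[0]
    out = np.empty((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(feature[i:i + k, j:j + k] * kernel)
    return out


def c4_lifting_conv(feature, kernel):
    """Correlate the map with all four 90-degree rotations of one kernel.

    Returns an array of shape (4, H', W'); axis 0 indexes the kernel orientation.
    """
    return np.stack(
        [correlate_valid(feature, np.rot90(kernel, r)) for r in range(4)]
    )


def is_c4_equivariant(feature, kernel):
    """Check that rotating the input only rotates and permutes the output."""
    out = c4_lifting_conv(feature, kernel)
    out_of_rotated = c4_lifting_conv(np.rot90(feature), kernel)
    # Expected: each response map rotated by 90 degrees in the spatial plane,
    # with the orientation axis cyclically shifted by one position.
    expected = np.roll(np.rot90(out, k=1, axes=(1, 2)), 1, axis=0)
    return bool(np.allclose(out_of_rotated, expected))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bev = rng.standard_normal((8, 8))     # toy stand-in for a fused BEV feature map
    kernel = rng.standard_normal((3, 3))  # toy stand-in for a learned filter
    print(is_c4_equivariant(bev, kernel))
```

Because the output changes in a predictable way under rotation of the input, a downstream head can in principle read orientation from the group axis rather than relearning each rotated appearance separately. An ordinary convolution with a single fixed kernel orientation does not have this property in general.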