Recently, 3D object detection has attracted significant attention and has improved steadily in real road scenarios. Environmental information is collected from a single sensor or fused from multiple sensors to detect objects of interest. However, most current 3D object detection approaches focus on developing advanced network architectures to improve detection precision and do not consider dynamic driving scenes, where data collected by sensors mounted on the vehicle contain various perturbation features. As a result, existing work still cannot handle the perturbation issue. To solve this problem, we propose a group equivariant bird's eye view network (GeqBevNet) based on group equivariance theory, which introduces group equivariance into the BEV fusion object detection network. The group equivariant network is applied to the fused BEV feature map to facilitate BEV-level rotation-equivariant feature extraction, which leads to a lower average orientation error. To demonstrate the effectiveness of GeqBevNet, we evaluate the network on the nuScenes validation set, where it reduces the mean average orientation error (mAOE) to 0.325. Experimental results demonstrate that GeqBevNet extracts more rotation-equivariant features in 3D object detection for real road scenes and improves object orientation prediction.
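To make the notion of BEV-level rotational equivariance concrete, the following is a minimal NumPy sketch and not the GeqBevNet implementation. It assumes the cyclic group C4 (rotations by multiples of 90°), a single-channel BEV map, and one 3×3 filter; the abstract does not specify the group or layer design, and the helper names `correlate2d_valid` and `c4_lifting_conv` are hypothetical. The sketch checks that rotating the input rotates the output maps and cyclically shifts the orientation channels.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d_valid(x, k):
    """Plain 'valid' 2D cross-correlation of a map x with a filter k."""
    windows = sliding_window_view(x, k.shape)  # (H-kh+1, W-kw+1, kh, kw)
    return np.einsum("ijab,ab->ij", windows, k)


def c4_lifting_conv(x, k):
    """Correlate x with the four 90-degree rotations of k.

    Returns an array of shape (4, H', W'): one map per filter orientation.
    Assumption: C4 is used here only for illustration.
    """
    return np.stack([correlate2d_valid(x, np.rot90(k, r)) for r in range(4)])


rng = np.random.default_rng(0)
bev = rng.standard_normal((16, 16))    # stand-in for a fused BEV feature map
kernel = rng.standard_normal((3, 3))   # stand-in for a learned filter

out = c4_lifting_conv(bev, kernel)
out_rot = c4_lifting_conv(np.rot90(bev), kernel)

# Equivariance: rotating the input rotates every output map spatially
# and shifts the orientation channels by one step.
expected = np.rot90(np.roll(out, 1, axis=0), axes=(1, 2))
is_equivariant = np.allclose(out_rot, expected)

# Max-pooling over the orientation axis removes the channel shift,
# leaving a map that simply rotates with the input.
pooled_equivariant = np.allclose(out_rot.max(axis=0), np.rot90(out.max(axis=0)))

print(is_equivariant, pooled_equivariant)
```

An ordinary convolution with a single fixed filter does not satisfy this property in general, which is one motivation for weight sharing across rotated copies of the filter.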