3D object detection with point clouds and images plays an important role in perception tasks such as autonomous driving. Current methods show great performance on detection and pose estimation of standard-shaped vehicles but lag behind on more complex shapes such as semi-trailer truck combinations. Determining the shape and motion of those special vehicles accurately is crucial in yard operation, maneuvering, and industrial automation applications. This work introduces several new methods to improve and measure the performance for such classes. State-of-the-art methods are based on predefined anchor grids or heatmaps for ground truth targets. However, the underlying representations do not take the shape of differently sized objects into account. Our main contribution, AdaptiveShape, uses shape-aware anchor distributions and heatmaps to improve the detection capabilities. For large vehicles, we achieve +10.9% AP in comparison to current shape-agnostic methods. Furthermore, we introduce a new fast LiDAR-camera fusion. It is based on 2D bounding box camera detections, which are available in many processing pipelines. This fusion method does not rely on perfectly calibrated or temporally synchronized systems and is therefore applicable to a broad range of robotic applications. We extend a standard point pillar network to account for temporal data and improve learning of complex object movements. In addition, we extend ground truth augmentation to use grouped object pairs, which further improves truck AP by +2.2% compared to conventional augmentation.
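To make the idea of a shape-aware heatmap target more concrete, the following is a minimal sketch and not the AdaptiveShape implementation. It assumes that the target for one object is a rotated, anisotropic Gaussian on a bird's-eye-view grid whose spreads scale with the box length and width. The function name `shape_aware_heatmap`, the parameter `sigma_scale`, and the choice of grid cells as units are all hypothetical and chosen only for illustration.

```python
import numpy as np


def shape_aware_heatmap(grid_hw, center_xy, length, width, yaw, sigma_scale=0.25):
    """Rotated anisotropic Gaussian target on a BEV grid (all units: grid cells).

    Illustrative assumption: the spread along each box axis is proportional
    to the box extent along that axis, so long objects get elongated targets.
    """
    h, w = grid_hw
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)

    # Offsets of every cell from the object center.
    dx = xs - center_xy[0]
    dy = ys - center_xy[1]

    # Rotate the offsets into the box frame (u along length, v along width).
    c, s = np.cos(yaw), np.sin(yaw)
    u = c * dx + s * dy
    v = -s * dx + c * dy

    # Hypothetical scaling of the Gaussian spread with the box size.
    sigma_u = max(sigma_scale * length, 1e-6)
    sigma_v = max(sigma_scale * width, 1e-6)

    return np.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))


# A long, narrow object (e.g. a trailer) aligned with the x-axis.
heat = shape_aware_heatmap((64, 64), center_xy=(32, 32), length=40, width=6, yaw=0.0)
```

In contrast to an isotropic Gaussian with a single radius, this target decays slowly along the long axis of the object and quickly across it, which is one plausible way for a heatmap to reflect the shape of large vehicles.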