Autonomous driving is regarded as one of the most promising remedies for shielding people from severe crashes. To this end, 3D object detection serves as the core of the perception stack, especially for path planning, motion prediction, and collision avoidance. Taking a quick glance at the progress made so far, we attribute the remaining challenges to three problems: recovering visual appearance from images in the absence of depth information, learning representations from partially occluded, unstructured point clouds, and semantically aligning heterogeneous features across modalities. Despite existing efforts, 3D object detection for autonomous driving is still in its infancy. Recently, a large body of literature has addressed this 3D vision task. Nevertheless, few studies have collected and structured this growing knowledge. We therefore aim to fill this gap with a comprehensive survey covering all the main concerns, including sensors, datasets, performance metrics, and recent state-of-the-art detection methods, together with their pros and cons. Furthermore, we provide quantitative comparisons with the state of the art. A case study of fifteen selected representative methods is presented, comprising runtime analysis, error analysis, and robustness analysis. Finally, we provide concluding remarks after an in-depth analysis of the surveyed works and identify promising directions for future work.