3D object detection from images, one of the fundamental and challenging problems in autonomous driving, has received increasing attention from both industry and academia in recent years. Benefiting from the rapid development of deep learning, image-based 3D detection has made remarkable progress: more than 200 works studied this problem from 2015 to 2021, spanning a broad spectrum of theories, algorithms, and applications. To date, however, no recent survey collects and organizes this knowledge. In this paper, we fill this gap in the literature and provide the first comprehensive survey of this novel and continuously growing research field, summarizing the most commonly used pipelines for image-based 3D detection and analyzing each of their components in depth. We also propose two new taxonomies that organize the state-of-the-art methods into categories, with the intent of providing a more systematic review of existing methods and facilitating fair comparisons with future works. Looking back on what has been achieved so far, we analyze the current challenges in the field and discuss future directions for image-based 3D detection research.