3D object detection from images, one of the fundamental and challenging problems in autonomous driving, has received increasing attention from both industry and academia in recent years. Benefiting from the rapid development of deep learning, image-based 3D detection has made remarkable progress: more than 200 works studied this problem between 2015 and 2021, spanning a broad spectrum of theories, algorithms, and applications. However, to date no recent survey collects and organizes this knowledge. In this paper, we fill this gap and provide the first comprehensive survey of this novel and continuously growing research field, summarizing the most commonly used pipelines for image-based 3D detection and analyzing each of their components in depth. We also propose two new taxonomies that organize state-of-the-art methods into categories, with the intent of providing a more systematic review of existing methods and facilitating fair comparisons with future works. Looking back at what has been achieved so far, we analyze the current challenges in the field and discuss future directions for image-based 3D detection research.