3D object detection from images, one of the fundamental and challenging problems in autonomous driving, has received increasing attention from both industry and academia in recent years. Benefiting from the rapid development of deep learning technologies, image-based 3D detection has achieved remarkable progress. In particular, more than 200 works studied this problem from 2015 to 2021, encompassing a broad spectrum of theories, algorithms, and applications. However, to date no survey has collected and organized this knowledge. In this paper, we fill this gap in the literature and provide the first comprehensive survey of this novel and continuously growing research field, summarizing the most commonly used pipelines for image-based 3D detection and analyzing each of their components in depth. We also propose two new taxonomies that organize the state-of-the-art methods into different categories, with the intent of providing a more systematic review of existing methods and facilitating fair comparisons with future works. Looking back at what has been achieved so far, we also analyze the current challenges in the field and discuss future directions for image-based 3D detection research.