Vision-based 3D detection is a fundamental task for the perception stack of an autonomous driving system and has piqued the interest of many researchers and autonomous driving engineers. However, achieving good 3D BEV (Bird's Eye View) performance from 2D camera input is not an easy task. In this paper, we provide a literature survey of existing vision-based 3D detection methods, focused on autonomous driving. We analyze in detail over $60$ papers that leverage vision-based BEV detection approaches and group them into sub-categories to make common trends easier to understand. Moreover, we highlight how the literature and industry trends have moved towards surround-view image-based methods and note which special cases these methods address. In conclusion, we offer thoughts on future research directions for 3D vision techniques, based on the shortcomings of current techniques, including the direction of collaborative perception.