Vision-based 3D detection is a fundamental task for perception in an autonomous driving system, and it has piqued the interest of many researchers and autonomous driving engineers. However, achieving good 3D BEV (Bird's Eye View) performance from 2D camera input alone is not easy. In this paper we survey the existing literature on vision-based 3D detection methods, with a focus on autonomous driving. We analyze over $60$ papers that use vision-based BEV detection approaches and group them into sub-categories to make common trends easier to understand. Moreover, we highlight how both the literature and industry have moved towards surround-view image-based methods, and we note which special cases these methods address. In conclusion, we suggest directions for future research on 3D vision techniques based on the shortcomings of current techniques, including the direction of collaborative perception.