3D occupancy perception aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, it is emerging as a trend in autonomous driving perception systems and is attracting significant attention from both industry and academia. Like traditional bird's-eye view (BEV) perception, 3D occupancy perception takes multi-source inputs and therefore requires information fusion. Unlike 2D BEV, however, it captures the vertical structures that BEV ignores. In this survey, we review the most recent works on 3D occupancy perception and provide in-depth analyses of methodologies across input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of state-of-the-art methods on the most popular datasets. We also discuss open challenges and future research directions. We hope this survey will inspire the community and encourage more research on 3D occupancy perception. A comprehensive list of the studies covered in this survey is available in an active repository that continuously collects the latest work: https://github.com/HuaiyuanXu/3D-Occupancy-Perception.