This paper introduces OccFusion, a straightforward and efficient sensor fusion framework for predicting 3D occupancy. A comprehensive understanding of 3D scenes is crucial in autonomous driving, and recent models for 3D semantic occupancy prediction have successfully addressed the challenge of describing real-world objects with varied shapes and classes. However, existing methods for 3D occupancy prediction rely heavily on surround-view camera images, which makes them susceptible to changes in lighting and weather conditions. By integrating features from additional sensors, such as lidar and surround-view radars, our framework enhances the accuracy and robustness of occupancy prediction and achieves top-tier performance on the nuScenes benchmark. Furthermore, extensive experiments on the nuScenes dataset, including challenging night and rainy scenarios, confirm the superior performance of our sensor fusion strategy across various perception ranges. The code for this framework will be made available at https://github.com/DanielMing123/OCCFusion.