This paper introduces OccFusion, a simple and efficient sensor fusion framework for 3D occupancy prediction. A comprehensive understanding of 3D scenes is crucial for autonomous driving, and recent models for 3D semantic occupancy prediction have successfully addressed the challenge of describing real-world objects with varied shapes and classes. However, existing methods for 3D occupancy prediction rely heavily on surround-view camera images, making them susceptible to changes in lighting and weather conditions. By integrating features from additional sensors, such as lidar and surround-view radars, our framework improves the accuracy and robustness of occupancy prediction, achieving top-tier performance on the nuScenes benchmark. Furthermore, extensive experiments on the nuScenes dataset, including challenging night and rainy scenarios, confirm the superior performance of our sensor fusion strategy across various perception ranges. The code for this framework will be made available at https://github.com/DanielMing123/OCCFusion.
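To make the feature-level fusion idea concrete, the sketch below shows a minimal multi-sensor occupancy head: camera, lidar, and radar features, already lifted into a shared ego-centric voxel grid, are concatenated along the channel dimension and passed through a 3D convolution before a per-voxel semantic classifier. The module name, channel sizes, grid resolution, and the concat-plus-conv fusion operator are all assumptions chosen for illustration; they are not the architecture described in the paper.

```python
import torch
import torch.nn as nn

class NaiveOccFusion(nn.Module):
    """Illustrative feature-level fusion of camera, lidar, and radar
    voxel features for semantic occupancy prediction. All hyperparameters
    here are assumptions for this sketch, not the paper's design."""

    def __init__(self, cam_ch=64, lidar_ch=64, radar_ch=32, num_classes=17):
        super().__init__()
        fused_ch = cam_ch + lidar_ch + radar_ch
        self.fuse = nn.Sequential(
            nn.Conv3d(fused_ch, 128, kernel_size=3, padding=1),
            nn.BatchNorm3d(128),
            nn.ReLU(inplace=True),
        )
        # Per-voxel semantic logits (free space plus object classes).
        self.head = nn.Conv3d(128, num_classes, kernel_size=1)

    def forward(self, cam_feat, lidar_feat, radar_feat):
        # Each input: (B, C_i, X, Y, Z) voxel feature volume in a
        # shared ego-centric 3D grid.
        x = torch.cat([cam_feat, lidar_feat, radar_feat], dim=1)
        return self.head(self.fuse(x))

if __name__ == "__main__":
    model = NaiveOccFusion()
    cam = torch.randn(1, 64, 50, 50, 4)
    lidar = torch.randn(1, 64, 50, 50, 4)
    radar = torch.randn(1, 32, 50, 50, 4)
    logits = model(cam, lidar, radar)
    print(logits.shape)  # torch.Size([1, 17, 50, 50, 4])
```

Because lidar and radar features enter the grid directly from range measurements rather than through image-based depth estimation, a fused representation of this kind degrades more gracefully under poor lighting and adverse weather, which is the robustness motivation stated above.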