Collaborative perception in automated vehicles leverages information exchange between agents to improve perception results. Previous camera-based collaborative 3D perception methods typically represent the environment with 3D bounding boxes or bird's-eye views; however, these representations fall short of a comprehensive 3D environmental prediction. To bridge this gap, we introduce the first method for collaborative 3D semantic occupancy prediction. In particular, it improves local 3D semantic occupancy predictions through a hybrid fusion of (i) semantic and occupancy task features and (ii) compressed orthogonal attention features shared between vehicles. Additionally, because no existing collaborative perception dataset is designed for semantic occupancy prediction, we augment a current collaborative perception dataset with 3D collaborative semantic occupancy labels for a more robust evaluation. The experimental results show that (i) our collaborative semantic occupancy predictions outperform single-vehicle results by more than 30%, and (ii) models anchored on semantic occupancy surpass state-of-the-art collaborative 3D detection techniques in downstream perception applications, demonstrating higher accuracy and richer semantic awareness of road environments.