Perception for automated driving is largely based on onboard environmental sensors, such as cameras and radar, which are cost-effective but limited by line-of-sight and field-of-view constraints. These inherent limitations may cause onboard perception to fail under occlusions or poor visibility conditions. In parallel, cooperative awareness via vehicle-to-everything (V2X) communication is becoming increasingly available, enabling vehicles and infrastructure to share their own state as object-level information that complements onboard perception. In this work, we study how such V2X information can be integrated into 3D object detection and how robust the resulting system is to realistic V2X imperfections. Using the nuScenes dataset, we emulate object-level cooperative awareness messages from ground truth, injecting controlled noise and object dropout to mimic real-world conditions such as latency, localization errors, and low V2X penetration rates. We convert these messages into a dedicated bird's-eye view (BEV) input and fuse them into a BEVFusion-style detector. Our results demonstrate that while object-level cooperative information can substantially improve detection performance, achieving an NDS of 0.80 under favorable conditions, models trained on idealized data become fragile and over-reliant on V2X. Conversely, our proposed noise-aware training strategy, coupled with explicit confidence encoding, enhances robustness, maintaining performance gains even under severe noise and reduced V2X penetration.
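The emulation and BEV-encoding steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian position-noise model, the grid extent and cell size, and the particular confidence formula are all assumptions chosen for illustration. The sketch keeps each ground-truth object with a probability equal to the V2X penetration rate, perturbs the positions that survive, and writes them into a two-channel BEV grid (occupancy and confidence).

```python
import numpy as np


def emulate_v2x_messages(gt_xy, penetration=0.5, pos_sigma=0.5, rng=None):
    """Emulate object-level V2X messages from ground-truth object centers.

    Hypothetical helper: each object is kept with probability `penetration`
    (object dropout), and Gaussian noise with standard deviation `pos_sigma`
    (meters) is added to the positions that remain.
    """
    rng = np.random.default_rng() if rng is None else rng
    gt_xy = np.asarray(gt_xy, dtype=np.float64).reshape(-1, 2)
    keep = rng.random(len(gt_xy)) < penetration
    noise = rng.normal(0.0, pos_sigma, size=(int(keep.sum()), 2))
    noisy_xy = gt_xy[keep] + noise
    # Assumed confidence encoding: lower confidence for higher noise levels.
    conf = np.full(len(noisy_xy), 1.0 / (1.0 + pos_sigma))
    return noisy_xy, conf


def rasterize_bev(xy, conf, grid_range=50.0, cell=0.5):
    """Rasterize object centers into a (2, N, N) BEV tensor.

    Channel 0 holds occupancy, channel 1 holds the per-object confidence.
    The grid covers [-grid_range, grid_range) meters on both axes.
    """
    n = int(round(2 * grid_range / cell))
    bev = np.zeros((2, n, n), dtype=np.float32)
    xy = np.asarray(xy, dtype=np.float64).reshape(-1, 2)
    idx = np.floor((xy + grid_range) / cell).astype(int)
    inside = np.all((idx >= 0) & (idx < n), axis=1)
    for (ix, iy), c in zip(idx[inside], np.asarray(conf)[inside]):
        bev[0, iy, ix] = 1.0
        bev[1, iy, ix] = max(bev[1, iy, ix], c)
    return bev


if __name__ == "__main__":
    gt = [[0.0, 0.0], [10.0, -5.0], [-20.0, 15.0]]
    xy, conf = emulate_v2x_messages(gt, penetration=0.7, pos_sigma=0.5,
                                    rng=np.random.default_rng(0))
    bev = rasterize_bev(xy, conf)
    print(bev.shape, int(bev[0].sum()))
```

In a BEVFusion-style pipeline, a tensor of this kind would typically be concatenated with or fused into the camera and radar BEV features. Varying `penetration` and `pos_sigma` during training is one plausible way to realize the noise-aware training the abstract describes.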