Robust Asynchronous Collaborative 3D Detection via Bird's Eye View Flow

By facilitating communication among multiple agents, collaborative perception can substantially boost each agent's perception ability. However, temporal asynchrony among agents is inevitable in the real world due to communication delays, interruptions, and clock misalignments. This asynchrony causes information mismatch during multi-agent fusion, undermining the foundation of collaboration. To address this issue, we propose CoBEVFlow, an asynchrony-robust collaborative 3D perception system based on bird's eye view (BEV) flow. The key intuition of CoBEVFlow is to compensate for motion so that the asynchronous collaboration messages sent by multiple agents are aligned. To model the motion in a scene, we propose BEV flow, which is the collection of motion vectors, one for each spatial location. Based on BEV flow, asynchronous perceptual features can be reassigned to appropriate positions, mitigating the impact of asynchrony. CoBEVFlow has two advantages: (i) it can handle asynchronous collaboration messages sent at irregular, continuous time stamps without discretization; and (ii) with BEV flow, it only transports the original perceptual features instead of generating new ones, avoiding additional noise. To validate CoBEVFlow's efficacy, we create IRregular V2V (IRV2V), the first synthetic collaborative perception dataset with various temporal asynchronies that simulate different real-world scenarios. Extensive experiments conducted on both IRV2V and the real-world dataset DAIR-V2X show that CoBEVFlow consistently outperforms other baselines and is robust in extremely asynchronous settings. The code will be released.
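
To make the idea of reassigning features along BEV flow concrete, the following is a minimal sketch and not the CoBEVFlow implementation. It assumes a feature map of shape (C, H, W), a flow field of shape (2, H, W) expressed in grid cells per second, and a continuous time delay in seconds. The function name `warp_bev_features`, the unit convention, and the rule that only non-zero cells are transported are all illustrative assumptions. Each occupied cell is moved to the nearest grid position implied by its motion vector, so the output contains only copies of the original feature values and no interpolated ones.

```python
import numpy as np


def warp_bev_features(features, flow, delay):
    """Move each occupied BEV cell along its motion vector (hypothetical helper).

    features: (C, H, W) array of perceptual features from a delayed message.
    flow:     (2, H, W) array; flow[0] is row motion and flow[1] is column
              motion, in grid cells per second (an assumed convention).
    delay:    time gap in seconds; any non-negative float, no discretization.
    """
    _, height, width = features.shape
    warped = np.zeros_like(features)

    # Assumption: a cell is "occupied" if any of its channels is non-zero.
    occupied = np.any(features != 0, axis=0)
    rows, cols = np.nonzero(occupied)

    # Displacement scales linearly with the continuous delay.
    new_rows = np.rint(rows + flow[0, rows, cols] * delay).astype(int)
    new_cols = np.rint(cols + flow[1, rows, cols] * delay).astype(int)

    # Drop cells that would leave the grid.
    inside = (
        (new_rows >= 0) & (new_rows < height)
        & (new_cols >= 0) & (new_cols < width)
    )

    # Copy the original feature vectors to their new positions. If two cells
    # land on the same target, the later one in row-major order wins.
    warped[:, new_rows[inside], new_cols[inside]] = (
        features[:, rows[inside], cols[inside]]
    )
    return warped


# Usage: one occupied cell moving 2 columns per second, message delayed 0.5 s.
features = np.zeros((1, 4, 4))
features[0, 1, 1] = 5.0
flow = np.zeros((2, 4, 4))
flow[1, 1, 1] = 2.0
aligned = warp_bev_features(features, flow, delay=0.5)
```

Because the delay enters only as a scalar multiplier on the flow, the same routine applies to any irregular time stamp. How the flow itself is estimated is outside the scope of this sketch.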
