Collaborative perception aims to mitigate the limitations of single-agent perception, such as occlusions, by facilitating data exchange among multiple agents. However, most current works consider a homogeneous scenario in which all agents use identical sensors and perception models. In reality, heterogeneous agent types may continually emerge and inevitably face a domain gap when collaborating with existing agents. In this paper, we introduce a new open heterogeneous problem: how can continually emerging heterogeneous agent types be accommodated in collaborative perception while ensuring high perception performance and low integration cost? To address this problem, we propose HEterogeneous ALliance (HEAL), a novel extensible collaborative perception framework. HEAL first establishes a unified feature space with the initial agents via a novel multi-scale, foreground-aware Pyramid Fusion network. When heterogeneous new agents emerge with previously unseen modalities or models, we align them to the established unified space with an innovative backward alignment. This step involves only individual training on the new agent type, and therefore offers extremely low training cost and high extensibility. To enrich the data heterogeneity of agents, we introduce OPV2V-H, a new large-scale dataset with more diverse sensor types. Extensive experiments on the OPV2V-H and DAIR-V2X datasets show that HEAL surpasses SOTA methods in performance while reducing the number of training parameters by 91.5% when integrating 3 new agent types. A comprehensive codebase is available at: https://github.com/yifanlu0227/HEAL
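The backward alignment step described in the abstract can be illustrated with a toy sketch. It is not HEAL's implementation: the class and function names are hypothetical, a fixed linear map stands in for the Pyramid Fusion network and detection head, and treating those components as frozen is our reading of "only involves individual training on the new agent type". The sketch only shows the general pattern of updating a new agent's encoder while the shared downstream components stay untouched.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


class FrozenHead:
    """Hypothetical stand-in for the already-trained fusion network and head.

    Its weights are stored in a tuple and are never updated (an assumption).
    """

    def __init__(self, weights):
        self.weights = tuple(weights)

    def forward(self, feature):
        return dot(self.weights, feature)

    def grad_wrt_feature(self):
        # For a linear head, d(output)/d(feature) equals the weights.
        return self.weights


class NewAgentEncoder:
    """Hypothetical trainable encoder for a new agent type: feature = W x."""

    def __init__(self, in_dim, feat_dim, rng):
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(feat_dim)]

    def forward(self, x):
        return [dot(row, x) for row in self.W]


def backward_align(encoder, head, data, lr=0.05, epochs=50):
    """Train only the encoder by SGD on squared error through the frozen head.

    Returns the mean squared error of each epoch, measured before each update.
    """
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            err = head.forward(encoder.forward(x)) - target
            total += err * err
            g_feat = head.grad_wrt_feature()
            # Chain rule: dLoss/dW[i][j] = 2 * err * g_feat[i] * x[j].
            for i, row in enumerate(encoder.W):
                for j in range(len(row)):
                    row[j] -= lr * 2.0 * err * g_feat[i] * x[j]
        losses.append(total / len(data))
    return losses


def make_toy_data(head, rng, n=20):
    """Synthetic targets from a reference encoder, so a perfect fit exists."""
    reference = [[0.5, -1.0, 0.3], [0.2, 0.4, -0.7]]
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        feature = [dot(row, x) for row in reference]
        data.append((x, head.forward(feature)))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    head = FrozenHead([1.0, -0.5])
    encoder = NewAgentEncoder(in_dim=3, feat_dim=2, rng=rng)
    losses = backward_align(encoder, head, make_toy_data(head, rng))
    print(f"first-epoch loss {losses[0]:.4f}, last-epoch loss {losses[-1]:.6f}")
```

Because the head's weights are immutable in this sketch, agents already using the shared feature space would be unaffected by the new agent's training, which is the property the abstract attributes to backward alignment.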