Indoor fire disasters pose severe challenges to autonomous search and rescue because of dense smoke, high temperatures, and dynamically evolving environments. In such time-critical scenarios, multi-agent cooperative navigation is particularly useful, as it enables faster and broader exploration than single-agent approaches. However, existing multi-agent navigation systems are primarily vision-based and designed for benign indoor settings, so their performance degrades significantly under fire-driven dynamic conditions. In this paper, we present VULCAN, a multi-agent cooperative navigation framework based on multi-modal perception and vision-language models (VLMs), tailored for indoor fire disaster response. We extend the Habitat-Matterport3D benchmark by simulating physically realistic fire scenarios, including smoke diffusion, thermal hazards, and sensor degradation. We evaluate representative multi-agent cooperative navigation baselines in both normal and fire-driven environments. Our results reveal critical failure modes of existing methods in fire scenarios and underscore the necessity of robust perception and hazard-aware planning for reliable multi-agent search and rescue.