Although robot-to-robot (R2R) communication improves indoor scene understanding beyond what a single robot can achieve, R2R alone cannot overcome partial observability without incurring substantial exploration overhead or increasing team size. In contrast, many indoor environments already include low-cost Internet of Things (IoT) sensors (e.g., cameras) that provide persistent, building-wide context beyond onboard perception. We therefore introduce IndoorR2X, the first benchmark and simulation framework for Large Language Model (LLM)-driven multi-robot task planning with Robot-to-Everything (R2X) perception and communication in indoor environments. IndoorR2X integrates observations from mobile robots and static IoT devices to construct a global semantic state that supports scalable scene understanding, reduces redundant exploration, and enables high-level coordination through LLM-based planning. It provides configurable simulation environments, sensor layouts, robot teams, and task suites for systematically evaluating high-level semantic coordination strategies. Extensive experiments across diverse settings demonstrate that IoT-augmented world modeling improves multi-robot efficiency and reliability, and we highlight key insights and failure modes for advancing LLM-based collaboration between robot teams and indoor IoT sensors.