Multi-agent pathfinding (MAPF) remains a critical problem in robotics and autonomous systems, where agents must navigate shared spaces efficiently while avoiding conflicts. Traditional centralized algorithms that rely on global information produce high-quality solutions but scale poorly to large instances because of the combinatorial explosion of conflicts. Conversely, distributed approaches that rely on local information, particularly learning-based methods, scale better because they need less information, but often at the cost of solution quality. In realistic deployments, information is a constrained resource: broadcasting full agent states and goals can raise privacy concerns, strain limited bandwidth, and require extra sensing and communication hardware, increasing cost and energy use. We focus on the core question of how MAPF can be solved with minimal inter-agent information sharing while preserving solution feasibility. To this end, we present an information-centric formulation of the MAPF problem and introduce a hybrid framework, IO-MAPF, that integrates decentralized path planning with a lightweight centralized coordinator. In this framework, agents use reinforcement learning (RL) to plan independently, while the central coordinator dynamically shares minimal, targeted signals, such as static conflict-cell indicators or short conflict trajectories, to support efficient conflict resolution. We introduce an Information Units (IU) metric to quantify information use and show that our alert-driven design achieves a 2x to 23x reduction in information sharing compared to state-of-the-art algorithms while maintaining high success rates. This demonstrates that reliable MAPF is achievable under strongly information-restricted, privacy-preserving conditions. We validate the approach in simulation and hardware experiments.
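The alert-driven idea described above can be illustrated with a small sketch. It is not the IO-MAPF implementation: the abstract does not specify how the coordinator detects conflicts or how an Information Unit is counted. The sketch assumes a hypothetical coordinator that can inspect each agent's planned path, flags only the first shared cell for each conflicting pair, and counts one unit per coordinate sent to an agent. The names `first_vertex_conflict`, `conflict_cell_alerts`, and `information_units` are illustrative.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]  # (row, col) on a grid map


def first_vertex_conflict(path_a: List[Cell], path_b: List[Cell]) -> Optional[Tuple[int, Cell]]:
    """Return (timestep, cell) of the first step where both agents occupy the same cell."""
    horizon = max(len(path_a), len(path_b))
    for t in range(horizon):
        # Assumption: an agent that has reached the end of its path waits there.
        a = path_a[min(t, len(path_a) - 1)]
        b = path_b[min(t, len(path_b) - 1)]
        if a == b:
            return t, a
    return None


def conflict_cell_alerts(paths: Dict[int, List[Cell]]) -> Dict[int, Set[Cell]]:
    """Hypothetical coordinator step: send each agent only the cells where it conflicts."""
    alerts: Dict[int, Set[Cell]] = {agent: set() for agent in paths}
    for a, b in combinations(sorted(paths), 2):
        hit = first_vertex_conflict(paths[a], paths[b])
        if hit is not None:
            _, cell = hit
            alerts[a].add(cell)
            alerts[b].add(cell)
    return alerts


def information_units(alerts: Dict[int, Set[Cell]]) -> int:
    """Assumed accounting: one unit per coordinate transmitted (two per alerted cell)."""
    return sum(2 * len(cells) for cells in alerts.values())


if __name__ == "__main__":
    planned = {
        0: [(0, 0), (0, 1), (0, 2)],
        1: [(1, 1), (0, 1)],
        2: [(3, 3), (3, 4)],
    }
    alerts = conflict_cell_alerts(planned)
    print(alerts)
    print(information_units(alerts))
```

In this toy accounting, agents that never conflict receive nothing, so the information cost grows with the number of conflicts rather than with the number of agents. The sketch checks vertex conflicts only; edge (swap) conflicts and the short conflict trajectories mentioned in the abstract are left out.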