Multi-Agent Path Finding (MAPF) in crowded environments is a challenging motion-planning problem whose goal is to find collision-free paths for all agents in the system. MAPF has applications in a wide range of domains, including aerial swarms, autonomous warehouse robotics, and self-driving vehicles. Current approaches to MAPF fall broadly into two categories: centralized and decentralized planning. Centralized planning suffers from the curse of dimensionality and thus does not scale well to large and complex environments. Decentralized planning, on the other hand, lets agents plan paths in real time in a partially observable environment and exhibits implicit coordination. However, decentralized methods suffer from slow convergence and performance degradation in dense environments. In this paper, we introduce CRAMP, a crowd-aware decentralized approach that addresses this problem using reinforcement learning guided by a boosted curriculum-based training strategy. We test CRAMP in simulated environments and demonstrate that it outperforms state-of-the-art decentralized methods for MAPF on various metrics. Compared with previous methods, CRAMP improves solution quality by up to 58% as measured by makespan and collision count, and by up to 5% in success rate.