Autonomous mobility is emerging as a new disruptive mode of urban transportation for moving cargo and passengers. However, designing scalable autonomous fleet coordination schemes to accommodate fast-growing mobility systems is challenging primarily due to the increasing heterogeneity of the fleets, time-varying demand patterns, service area expansions, and communication limitations. We introduce the concept of partially observable advanced air mobility games to coordinate a fleet of aerial vehicles by accounting for the heterogeneity of the interacting agents and the self-interested nature inherent to commercial mobility fleets. To model the complex interactions among the agents and the observation uncertainty in the mobility networks, we propose a novel heterogeneous graph attention encoder-decoder (HetGAT Enc-Dec) neural network-based stochastic policy. We train the policy by leveraging deep multi-agent reinforcement learning, allowing decentralized decision-making for the agents using their local observations. Through extensive experimentation, we show that the learned policy generalizes to various fleet compositions, demand patterns, and observation topologies. Further, fleets operating under the HetGAT Enc-Dec policy outperform other state-of-the-art graph neural network policies by achieving the highest fleet reward and fulfillment ratios in on-demand mobility networks.