Advanced Air Mobility (AAM) introduces a new, efficient mode of transportation that uses vehicle autonomy and electrified aircraft to provide increasingly autonomous transportation between previously underserved markets. Safe and efficient navigation of low-altitude aircraft through high-density environments requires integrating many complex observations, such as surveillance data, knowledge of vehicle dynamics, and weather. Processing and reasoning over these observations is challenging because the information carries several sources of uncertainty and because each aircraft must cooperate with a variable number of other aircraft in the airspace. These challenges, coupled with the requirement to make safety-critical decisions in real time, rule out the use of conventional separation assurance techniques. We present a decentralized reinforcement learning framework that provides autonomous self-separation capabilities within AAM corridors using speed and vertical maneuvers. The problem is formulated as a Markov Decision Process and solved with a novel extension of the sample-efficient, off-policy soft actor-critic (SAC) algorithm. We introduce attention networks for processing variable-length observations and a distributed computing architecture that achieves higher training sample throughput than existing approaches. A comprehensive numerical study shows that the proposed framework can ensure safe and efficient separation of aircraft in high-density, dynamic environments with various sources of uncertainty.
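As a rough illustration of how an attention layer can turn a variable number of neighboring-aircraft observations into a fixed-size input for a policy network, the sketch below uses scaled dot-product attention pooling in plain Python. It is not the architecture from this work: the feature sizes, the random weight matrices, and the names `attention_pool`, `matvec`, and `random_matrix` are all assumptions made for the example.

```python
import math
import random

def matvec(vec, mat):
    """Multiply a row vector (length r) by an r x c matrix; returns length c."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def attention_pool(own_state, intruders, w_q, w_k, w_v):
    """Summarize any number of intruder feature vectors as one fixed-size vector.

    The ownship state forms the query; each intruder contributes a key and a
    value. Softmax weights over the intruders make the result independent of
    how many intruders there are and of their ordering.
    """
    out_dim = len(w_v[0])
    if not intruders:
        # Assumed convention: an empty airspace maps to the zero vector.
        return [0.0] * out_dim
    query = matvec(own_state, w_q)
    keys = [matvec(x, w_k) for x in intruders]
    values = [matvec(x, w_v) for x in intruders]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]

# Assumed feature sizes, chosen only for the demonstration.
OWN_DIM, INTRUDER_DIM, HIDDEN = 4, 6, 8
rng = random.Random(0)
w_q = random_matrix(OWN_DIM, HIDDEN, rng)
w_k = random_matrix(INTRUDER_DIM, HIDDEN, rng)
w_v = random_matrix(INTRUDER_DIM, HIDDEN, rng)
own = [rng.gauss(0.0, 1.0) for _ in range(OWN_DIM)]
traffic = [[rng.gauss(0.0, 1.0) for _ in range(INTRUDER_DIM)] for _ in range(5)]

# The pooled vector has the same length for 0, 1, or 5 intruders.
pooled_sizes = [len(attention_pool(own, traffic[:n], w_q, w_k, w_v)) for n in (0, 1, 5)]
```

Because the output length does not depend on the number of intruders, the same downstream actor and critic layers could be reused as traffic density changes; in a trained agent the weight matrices would be learned jointly with the policy rather than sampled at random.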