This paper presents a deep reinforcement learning (DRL) solution that jointly optimizes the cell-association decisions and velocities of multiple UAVs on a 3D aerial highway. The objective is to improve both transportation and communication performance, covering collision avoidance, connectivity, and handovers. The problem is formulated as a Markov decision process (MDP) in which each UAV's state is defined by its velocity and communication data rate. We propose a neural architecture with a shared decision module and multiple network branches, each dedicated to one action dimension of a 2D transportation-communication action space. This design handles the multi-dimensional action space efficiently while allowing each action dimension to be handled independently. To demonstrate the approach, we introduce two models: Branching Dueling Q-Network (BDQ) and Branching Dueling Double Deep Q-Network (Dueling DDQN). Simulation results show a significant improvement of 18.32% over existing benchmarks.
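The branching idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the layer sizes, the random weights, the two-element state, and the branch sizes (3 velocity levels, 4 candidate cells) are hypothetical, and the per-branch aggregation (state value plus advantage minus the branch's mean advantage) is the form commonly used with dueling branching networks, assumed here rather than taken from the source.

```python
import random


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, z) for z in v]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases (illustrative only, untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class BranchingDuelingQ:
    """Shared module, one state-value head, one advantage branch per action dimension."""

    def __init__(self, state_dim, hidden_dim, branch_sizes, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, hidden_dim, state_dim)
        self.value = init_layer(rng, 1, hidden_dim)
        self.branches = [init_layer(rng, n, hidden_dim) for n in branch_sizes]

    def q_values(self, state):
        h = relu(linear(*self.shared, state))   # shared representation
        v = linear(*self.value, h)[0]           # scalar state value V(s)
        q = []
        for weights, bias in self.branches:
            adv = linear(weights, bias, h)      # advantages for this dimension
            mean_adv = sum(adv) / len(adv)
            # Assumed aggregation: Q_d(s, a) = V(s) + A_d(s, a) - mean(A_d(s, .))
            q.append([v + a - mean_adv for a in adv])
        return q

    def greedy_action(self, state):
        # Each branch picks its own argmax, independently of the others.
        return [max(range(len(qd)), key=qd.__getitem__)
                for qd in self.q_values(state)]


# Hypothetical setup: branch 0 = velocity level, branch 1 = cell association.
net = BranchingDuelingQ(state_dim=2, hidden_dim=8, branch_sizes=[3, 4])
state = [0.6, 0.3]  # hypothetical normalized velocity and data rate
q = net.q_values(state)
action = net.greedy_action(state)
```

With these hypothetical sizes the network emits 3 + 4 = 7 Q-values, whereas a single head over the joint action space would need 3 × 4 = 12. The gap widens as action dimensions or levels per dimension are added, which is the efficiency argument behind the branching design.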