In this paper, we employ multiple unmanned aerial vehicles (UAVs) to accelerate data transmissions from ground users (GUs) to a remote base station (BS) via the UAVs' relay communications. Because the UAVs exchange information only intermittently, each UAV typically acquires the complete system state with some delay, which hinders effective collaboration. To maximize the overall throughput, we first propose a delay-tolerant multi-agent deep reinforcement learning (MADRL) algorithm that integrates a delay-penalized reward to encourage information sharing among UAVs, while jointly optimizing the UAVs' trajectory planning, network formation, and transmission control strategies. Additionally, considering information loss due to unreliable channel conditions, we propose a spatio-temporal attention-based prediction approach to recover the lost information and enhance each UAV's awareness of the network state. Together, these two designs aim to enhance the capacity of UAV-assisted wireless networks with limited communications. Simulation results show that our approach reduces information delay by over 50% and improves throughput by 75% compared with conventional MADRL. Interestingly, improving the UAVs' information sharing does not sacrifice network capacity. Instead, it significantly improves the learning performance and the throughput simultaneously. The approach also reduces the need for information exchange among UAVs, thus fostering practical deployment of MADRL in UAV-assisted wireless networks.
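To make the idea of a delay-penalized reward concrete, the following is a minimal Python sketch. It is an illustrative assumption, not the reward function used in this paper: the linear form, the names `UavStep` and `delay_penalized_reward`, the `penalty_weight` parameter, and the chosen units are all hypothetical. It only shows how a per-step throughput term could be discounted by the staleness of the information a UAV holds about its peers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UavStep:
    """One UAV's observation for a single time step (hypothetical structure)."""
    throughput: float         # data relayed toward the BS in this step
    info_delays: List[float]  # age, in time slots, of the latest state held for each peer UAV


def delay_penalized_reward(step: UavStep, penalty_weight: float = 0.1) -> float:
    """Return throughput minus a penalty proportional to the mean information delay.

    Assumed linear form for illustration only; the paper's actual reward
    is not specified in the abstract.
    """
    if not step.info_delays:
        # No peers to track, so no staleness penalty applies.
        return step.throughput
    mean_delay = sum(step.info_delays) / len(step.info_delays)
    return step.throughput - penalty_weight * mean_delay


# Two steps with equal throughput: fresher peer information earns a higher reward.
fresh = delay_penalized_reward(UavStep(throughput=10.0, info_delays=[1.0, 1.0]))
stale = delay_penalized_reward(UavStep(throughput=10.0, info_delays=[5.0, 7.0]))
```

Under this assumed form, an agent that keeps its view of the other UAVs up to date is rewarded more than one that achieves the same throughput with stale information, which is the incentive the delay-penalized reward is meant to create.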