In the era of 5G mobile communication, research on unmanned aerial vehicles (UAVs) and mobile edge computing has surged. UAVs can serve as intelligent servers in edge computing environments, optimizing their flight trajectories to maximize communication system throughput. However, trajectory optimization algorithms based on deep reinforcement learning (DRL) may train poorly when terrain features are intricate and training data are scarce. To overcome this limitation, some studies have proposed using federated learning (FL) to mitigate the data isolation problem and speed up convergence. Nevertheless, high heterogeneity among local data can degrade the global FL model, which may impede training and even compromise the performance of local agents. To address these challenges, this work proposes personalized federated deep reinforcement learning (PF-DRL) for multi-UAV trajectory optimization. PF-DRL aims to develop an individualized model for each agent, addressing data scarcity while mitigating the negative impact of data heterogeneity. Simulation results show that, compared with other DRL-based approaches, the proposed algorithm achieves better training performance, converges faster, and improves service quality.
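The abstract does not say how PF-DRL personalizes each agent's model. As a rough illustration only, the sketch below shows one common personalization pattern from the federated learning literature, which may differ from the authors' method: the agents average a set of shared layers while each keeps its own local layers untouched. The names `average_shared`, `personalized_round`, `base`, and `head` are hypothetical and do not come from the source.

```python
from typing import Dict, List

# Each agent's model is a dict mapping a layer name to a flat list of weights.
Params = Dict[str, List[float]]


def average_shared(agents: List[Params], shared_keys: List[str]) -> Params:
    """Element-wise mean of the shared layers across all agents."""
    n = len(agents)
    return {
        key: [sum(vals) / n for vals in zip(*(a[key] for a in agents))]
        for key in shared_keys
    }


def personalized_round(agents: List[Params], shared_keys: List[str]) -> List[Params]:
    """One aggregation round (assumed scheme, not taken from the paper).

    Shared layers are replaced by their federated average; every other
    layer stays local, so each agent keeps a personalized model.
    """
    global_shared = average_shared(agents, shared_keys)
    updated = []
    for agent in agents:
        new_params = {name: list(weights) for name, weights in agent.items()}
        for key in shared_keys:
            new_params[key] = list(global_shared[key])
        updated.append(new_params)
    return updated


if __name__ == "__main__":
    # Two hypothetical UAV agents with a shared "base" and a local "head".
    uav_agents = [
        {"base": [1.0, 2.0], "head": [0.5]},
        {"base": [3.0, 4.0], "head": [-0.5]},
    ]
    for params in personalized_round(uav_agents, shared_keys=["base"]):
        print(params)
```

In this sketch the shared layers benefit from every agent's experience, which targets data scarcity, while the local layers can adapt to each UAV's own environment, which targets heterogeneity. Where to split shared from local layers is a design choice the abstract leaves open.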