This paper investigates an interference-aware joint path planning and power allocation mechanism for a cellular-connected unmanned aerial vehicle (UAV) in a sparse suburban environment. The UAV's goal is to fly from an initial point to a destination point, moving across cells while guaranteeing the required quality of service (QoS). In particular, the UAV aims to maximize its uplink throughput and minimize the interference it causes to the ground user equipment (UEs) connected to neighboring cellular base stations (BSs), while accounting for the shortest path and limited flight resources. Expert knowledge is used to explore the scenario and define the desired behavior for training the agent (i.e., the UAV). To solve the problem, an apprenticeship learning method is applied via inverse reinforcement learning (IRL), based on both Q-learning and deep reinforcement learning (DRL). The performance of this method is compared with behavioral cloning (BC), a learning-from-demonstration technique that uses a supervised learning approach. Simulation and numerical results show that the proposed approach can achieve expert-level performance. We also demonstrate that, unlike the BC technique, the performance of the proposed approach does not degrade in unseen situations.
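The abstract does not state which features or which IRL variant are used, so the following is only a minimal sketch of the general idea behind apprenticeship learning via IRL, assuming a reward that is linear in a feature vector. The three per-step features (throughput, interference, step cost), the discount factor, and all function names are hypothetical and chosen for illustration; they are not taken from the paper. The sketch computes discounted feature expectations from expert and learner trajectories and derives a reward direction from their difference.

```python
from typing import List, Sequence

# Hypothetical per-step features: (throughput, interference, step cost).
Features = Sequence[float]


def feature_expectation(trajectories: List[List[Features]], gamma: float = 0.9) -> List[float]:
    """Average discounted sum of feature vectors over a set of trajectories."""
    dim = len(trajectories[0][0])
    mu = [0.0] * dim
    for traj in trajectories:
        for t, phi in enumerate(traj):
            for i in range(dim):
                mu[i] += (gamma ** t) * phi[i]
    return [m / len(trajectories) for m in mu]


def reward_weights(mu_expert: Sequence[float], mu_learner: Sequence[float]) -> List[float]:
    """Reward direction proportional to mu_expert - mu_learner, scaled to unit L2 norm."""
    diff = [e - l for e, l in zip(mu_expert, mu_learner)]
    norm = sum(d * d for d in diff) ** 0.5
    return diff if norm == 0 else [d / norm for d in diff]


def linear_reward(w: Sequence[float], phi: Features) -> float:
    """Reward of a single step under the assumed linear model R = w . phi."""
    return sum(wi * pi for wi, pi in zip(w, phi))


# Toy demonstration data (made up): the expert gets high throughput with low interference.
expert_trajs = [[(1.0, 0.1, 1.0), (0.9, 0.0, 1.0)]]
learner_trajs = [[(0.2, 0.8, 1.0), (0.3, 0.7, 1.0)]]

w = reward_weights(
    feature_expectation(expert_trajs, gamma=0.5),
    feature_expectation(learner_trajs, gamma=0.5),
)
```

In a full apprenticeship learning loop, the learner would be retrained (for example with Q-learning or a DRL agent) under the reward given by `w`, and the weights would be recomputed until the learner's feature expectations approach the expert's. Behavioral cloning, by contrast, would fit a classifier from states to expert actions without recovering a reward.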