The low-altitude Internet of Things (IoT), supported by unmanned aerial vehicles (UAVs), provides ground sensing networks with advanced real-time monitoring and data collection. To maximize the volume of data collected from distributed IoT nodes, AI-powered data collection plays a critical role in enabling intelligent decision-making. Among such techniques, deep reinforcement learning (DRL) has gained particular attention. However, existing DRL-based work on UAV-assisted IoT data collection rarely addresses challenges such as interference and dynamic data volume, and it also suffers from high computational demands and slow convergence. To address these challenges, a hierarchical DRL (HDRL) approach is designed that jointly optimizes UAV trajectories and bandwidth allocation to maximize data collection volume. First, the proposed scenario incorporates interference, dynamic data volume at the IoT nodes, and multiple types of obstacles. The overall task is structured hierarchically: the upper level makes flight trajectory decisions at a coarse temporal granularity, while the lower level makes bandwidth allocation decisions at a finer temporal granularity. Second, a trajectory and bandwidth allocation optimization algorithm based on hierarchical deep deterministic policy gradients (TBH-DDPG) is proposed to solve the problem. Finally, simulation results demonstrate that, compared with a non-hierarchical algorithm, the proposed algorithm improves convergence speed by 44.44% and reduces computational cost by 58.05%.
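The two-timescale structure described above can be illustrated with a minimal Python sketch. It is not the TBH-DDPG algorithm: the learned DDPG actors are replaced by hand-written heuristics, and interference and obstacles are not modeled. All names (`upper_policy`, `lower_policy`, `run_episode`), the toy rate model, and the parameter values are assumptions made only for illustration. The sketch shows one upper-level heading decision being held fixed while several lower-level bandwidth decisions are made.

```python
import math
import random


def upper_policy(uav_xy, nodes):
    """Hypothetical upper-level policy: unit heading toward the node with the largest backlog."""
    target = max(nodes, key=lambda n: n["data"])
    dx = target["xy"][0] - uav_xy[0]
    dy = target["xy"][1] - uav_xy[1]
    dist = math.hypot(dx, dy)
    return (0.0, 0.0) if dist == 0 else (dx / dist, dy / dist)


def lower_policy(nodes):
    """Hypothetical lower-level policy: bandwidth shares proportional to each node's backlog."""
    total = sum(n["data"] for n in nodes)
    if total == 0:
        return [1.0 / len(nodes)] * len(nodes)
    return [n["data"] / total for n in nodes]


def run_episode(nodes, upper_steps=3, lower_per_upper=4, speed=10.0,
                capacity=5.0, seed=0):
    """Run one toy episode with a coarse (trajectory) and a fine (bandwidth) timescale."""
    rng = random.Random(seed)
    nodes = [dict(n) for n in nodes]  # copy so the caller's data is untouched
    uav = (0.0, 0.0)
    collected = 0.0
    shares_log = []
    upper_decisions = 0

    for _k in range(upper_steps):
        # Coarse timescale: choose a heading once per upper-level step.
        heading = upper_policy(uav, nodes)
        upper_decisions += 1

        for _t in range(lower_per_upper):
            # The UAV keeps the same heading across all fine-grained slots.
            step = speed / lower_per_upper
            uav = (uav[0] + heading[0] * step, uav[1] + heading[1] * step)

            # Dynamic data volume: new data arrives at every node (assumed model).
            for n in nodes:
                n["data"] += rng.uniform(0.0, 1.0)

            # Fine timescale: re-allocate bandwidth in every slot.
            shares = lower_policy(nodes)
            shares_log.append(shares)

            for n, share in zip(nodes, shares):
                dist = math.hypot(n["xy"][0] - uav[0], n["xy"][1] - uav[1])
                # Toy rate model (assumption): rate decays with distance.
                rate = share * capacity / (1.0 + dist / 10.0)
                sent = min(n["data"], rate)
                n["data"] -= sent
                collected += sent

    return {
        "upper_decisions": upper_decisions,
        "lower_decisions": len(shares_log),
        "collected": collected,
        "shares_log": shares_log,
    }
```

For example, `run_episode([{"xy": (30.0, 0.0), "data": 5.0}, {"xy": (0.0, 40.0), "data": 2.0}])` makes 3 upper-level decisions and 12 lower-level decisions. In a learned version, each heuristic would be replaced by a trained actor operating at its own timescale.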