As next-generation Internet of Things (NG-IoT) networks grow, the number of connected devices and their aggregate energy demand are increasing rapidly, which creates challenges for resource management and sustainability. Energy-efficient communication, particularly for power-limited IoT devices, is therefore a key research focus. In this paper, we study Long Range (LoRa) networks supported by multiple unmanned aerial vehicles (UAVs) in an uplink data collection scenario. Our objective is to maximize system energy efficiency by jointly optimizing transmission power, spreading factor, bandwidth, and user association. To address this challenging problem, we first model it as a partially observable stochastic game (POSG) to account for dynamic channel conditions, end device mobility, and partial observability at each UAV. We then propose a two-stage solution: a channel-aware matching algorithm for end device-UAV association, and a cooperative multi-agent reinforcement learning (MARL) framework based on multi-agent proximal policy optimization (MAPPO) for resource allocation, trained under the centralized training with decentralized execution (CTDE) paradigm. Simulation results show that the proposed approach significantly outperforms conventional off-policy and on-policy MARL algorithms.
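To illustrate why transmission power, spreading factor, and bandwidth are coupled in the objective, the following minimal sketch computes a per-device energy efficiency in bits per joule. It uses the standard nominal LoRa bit-rate expression, SF · (BW / 2^SF) · CR. The function names, the 4/5 default coding rate, the optional circuit-power term, and the rate-over-power definition of energy efficiency are assumptions made for illustration; the abstract does not state the exact formulation used in the paper.

```python
def lora_bit_rate(sf: int, bw_hz: float, cr_denominator: int = 5) -> float:
    """Nominal LoRa bit rate in bit/s: SF * (BW / 2**SF) * (4 / cr_denominator)."""
    if sf <= 0 or bw_hz <= 0:
        raise ValueError("spreading factor and bandwidth must be positive")
    return sf * (bw_hz / 2 ** sf) * (4 / cr_denominator)


def dbm_to_watts(power_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** ((power_dbm - 30) / 10)


def energy_efficiency(sf: int, bw_hz: float, tx_power_dbm: float,
                      circuit_power_w: float = 0.0) -> float:
    """Illustrative energy efficiency in bit/J: bit rate over consumed power.

    Assumption: consumed power is transmit power plus a constant circuit
    power; packet loss and channel effects are ignored in this sketch.
    """
    total_power_w = dbm_to_watts(tx_power_dbm) + circuit_power_w
    return lora_bit_rate(sf, bw_hz) / total_power_w


if __name__ == "__main__":
    # Compare spreading factors at 125 kHz bandwidth and 14 dBm transmit power.
    for sf in (7, 9, 12):
        rate = lora_bit_rate(sf, 125_000)
        ee = energy_efficiency(sf, 125_000, 14.0)
        print(f"SF{sf}: {rate:.2f} bit/s, {ee:.1f} bit/J")
```

Under this simplified model a lower spreading factor always yields higher energy efficiency at a fixed transmit power. In practice a higher spreading factor improves receiver sensitivity and range at the cost of data rate, so the best setting depends on the channel between each end device and its associated UAV.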