The integration of subterranean LoRaWAN and non-terrestrial networks (NTN) delivers substantial economic and societal benefits in remote agriculture and disaster rescue operations. LoRa modulation leverages quasi-orthogonal spreading factors (SFs) to optimize data rate, airtime, coverage, and energy consumption. However, effectively assigning SFs to end devices so as to minimize co-SF interference remains challenging in massive subterranean LoRaWAN NTN. To address this, we investigate a reinforcement learning (RL)-based SF allocation scheme that optimizes the system's energy efficiency (EE). To efficiently capture device-to-environment interactions in dense networks, we propose an SF allocation technique using the multi-agent dueling double deep Q-network (MAD3QN) and multi-agent advantage actor-critic (MAA2C) algorithms, based on an analytical reward mechanism. The proposed RL-based SF allocation approach outperforms four benchmarks in the extreme underground direct-to-satellite scenario. Notably, MAD3QN shows promising potential to surpass MAA2C in terms of convergence rate and EE.
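To make the SF trade-off mentioned above concrete, the following minimal sketch computes the nominal LoRa symbol duration and bit rate for SF7 to SF12 using the standard LoRa modulation relations T_sym = 2^SF / BW and R_b = SF · (BW / 2^SF) · CR. The 125 kHz bandwidth, the 4/5 coding rate, and the function names are illustrative assumptions, not parameters taken from this work.

```python
# Illustrative sketch: nominal LoRa symbol time and bit rate per spreading factor.
# Bandwidth (125 kHz) and coding rate (4/5) are assumed values for illustration.

def lora_symbol_time(sf: int, bandwidth_hz: float = 125_000.0) -> float:
    """Symbol duration in seconds: T_sym = 2^SF / BW."""
    return (2 ** sf) / bandwidth_hz


def lora_bit_rate(sf: int, bandwidth_hz: float = 125_000.0,
                  cr_denominator: int = 5) -> float:
    """Nominal bit rate in bit/s: R_b = SF * (BW / 2^SF) * (4 / cr_denominator)."""
    return sf * (bandwidth_hz / 2 ** sf) * (4.0 / cr_denominator)


if __name__ == "__main__":
    # Higher SFs give longer symbols (more airtime and energy per bit,
    # but better sensitivity) and lower bit rates.
    for sf in range(7, 13):
        t_ms = lora_symbol_time(sf) * 1e3
        rate = lora_bit_rate(sf)
        print(f"SF{sf}: symbol time = {t_ms:.3f} ms, bit rate = {rate:.1f} bit/s")
```

Under these assumptions, each step up in SF doubles the symbol duration, which is why an allocation scheme has to weigh coverage against airtime and energy consumption when assigning SFs to end devices.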