Dynamic resource allocation in heterogeneous wireless networks (HetNets) challenges traditional methods under time-varying user loads and channel conditions. We propose a deep reinforcement learning (DRL) framework that jointly optimises transmit power, bandwidth, and scheduling via a multi-objective reward balancing throughput, energy efficiency, and fairness. Using real base station coordinates, we compare Proximal Policy Optimisation (PPO) and Twin Delayed Deep Deterministic Policy Gradient (TD3) against three heuristic algorithms across multiple network scenarios. Our results show that both DRL methods outperform the heuristic baselines at resource allocation in dynamic networks. These findings highlight key trade-offs in DRL design for future HetNets.
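The multi-objective reward described above can be illustrated with a minimal sketch. The weighted-sum form, the specific weights, the energy-efficiency definition (sum-throughput per watt), and the use of Jain's fairness index are all illustrative assumptions here, not the paper's exact formulation:

```python
import numpy as np

def multi_objective_reward(throughputs, powers, w_tp=1.0, w_ee=1.0, w_fair=1.0):
    """Hypothetical weighted reward combining per-user throughput, energy
    efficiency, and Jain's fairness index. Weights and normalisation are
    assumptions for illustration only."""
    total_tp = np.sum(throughputs)                      # aggregate throughput
    energy_eff = total_tp / max(np.sum(powers), 1e-9)   # throughput per unit power
    # Jain's fairness index: ranges from 1/N (one user gets all) to 1 (equal shares)
    fairness = total_tp**2 / (len(throughputs) * np.sum(throughputs**2) + 1e-12)
    return w_tp * total_tp + w_ee * energy_eff + w_fair * fairness
```

A DRL agent (e.g. PPO or TD3) would receive this scalar after each allocation step; tuning the weights exposes the throughput/efficiency/fairness trade-off the abstract refers to.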