In sixth-generation (6G) ultra-dense networks, aggressive frequency reuse amplifies inter-cell interference (ICI), so that scheduling and power control in multi-cell orthogonal frequency-division multiple access (OFDMA) systems become strongly coupled across neighboring cells. We study distributed downlink resource management, namely joint subcarrier scheduling and power allocation, under interference coupling and long-term per-user quality-of-service (QoS) minimum-rate constraints. We develop FedCritic, a serverless federated multi-agent actor-critic framework with decentralized execution that enforces long-term QoS through virtual-queue deficit weights. Unlike centralized training with decentralized execution (CTDE) approaches, which require centralized critic learning and joint trajectory aggregation, FedCritic federates the critic through lightweight gossip-based parameter averaging over the interference graph, enabling stable value estimation without a central coordinator while keeping policies local. Simulations in an interference-rich reuse-1 setting show that, relative to non-coordinated and CTDE baselines, FedCritic improves mean signal-to-interference-plus-noise ratio (SINR), cell-edge rate, network-wide average sum-rate, and fairness, while training more stably and with lower coordination overhead.
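The two mechanisms named in the abstract, virtual-queue deficit weights and gossip-based critic averaging over the interference graph, can be illustrated with a minimal sketch. The abstract does not give the exact update rules, so everything here is an assumption: the deficit queue uses the common Lyapunov-style form Q <- max(Q + r_min - r, 0), the mixing matrix uses Metropolis weights, and the function names (`update_deficit_queues`, `metropolis_weights`, `gossip_round`) are hypothetical, not taken from the paper.

```python
import numpy as np


def update_deficit_queues(queues, min_rates, served_rates):
    """Assumed virtual-queue rule: Q <- max(Q + r_min - r, 0).

    A user's queue grows while its served rate is below its minimum
    rate, so the queue value can act as a scheduling weight.
    """
    q = np.asarray(queues, dtype=float)
    r_min = np.asarray(min_rates, dtype=float)
    r = np.asarray(served_rates, dtype=float)
    return np.maximum(q + r_min - r, 0.0)


def metropolis_weights(adjacency):
    """Build a symmetric, doubly stochastic mixing matrix.

    `adjacency` is a symmetric 0/1 matrix with a zero diagonal whose
    entry (i, j) is 1 when cells i and j interfere (an assumed encoding
    of the interference graph).
    """
    a = np.asarray(adjacency, dtype=float)
    n = a.shape[0]
    degree = a.sum(axis=1)
    w = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and a[i, j] > 0:
                w[i, j] = 1.0 / (1.0 + max(degree[i], degree[j]))
        # Self-weight makes each row sum to one.
        w[i, i] = 1.0 - w[i].sum()
    return w


def gossip_round(critic_params, mixing):
    """One gossip step on stacked critic parameters.

    Row i of `critic_params` holds cell i's flattened critic
    parameters; each cell mixes only with its graph neighbors because
    `mixing` is zero for non-adjacent pairs.
    """
    return mixing @ np.asarray(critic_params, dtype=float)
```

With a doubly stochastic mixing matrix on a connected graph, repeated calls to `gossip_round` keep the network-wide parameter mean unchanged and drive every cell's critic toward that mean, which is the property that lets averaging proceed without a central coordinator. Actor (policy) parameters would stay out of this exchange, matching the abstract's statement that policies remain local.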