The large number of antennas in massive MIMO systems allows the base station to serve multiple users on the same time-frequency resource through multi-user beamforming. However, highly correlated user channels can drastically reduce the spectral efficiency that multi-user beamforming achieves. It is therefore critical for the base station to schedule a suitable group of users in each transmission interval, maximizing spectral efficiency while adhering to fairness constraints among the users. User scheduling is an NP-hard problem whose complexity grows exponentially with the number of users. In this paper, we consider the user scheduling problem for massive MIMO systems. Inspired by recent achievements of deep reinforcement learning (DRL) in solving problems with large action sets, we propose \name{}, a dynamic scheduler for massive MIMO based on the state-of-the-art Soft Actor-Critic (SAC) DRL model and the K-Nearest Neighbors (KNN) algorithm. Through comprehensive simulations using realistic massive MIMO channel models as well as real-world datasets from channel measurement experiments, we demonstrate the effectiveness of the proposed model under various channel conditions. Our results show that, in medium-sized networks where the optimal proportionally fair (PF) scheduler is computationally feasible, the proposed model performs very close to it in terms of spectral efficiency and fairness, with more than an order of magnitude lower computational complexity. Our results also show the feasibility and high performance of the proposed scheduler in networks with a large number of users.
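As a rough illustration of why exhaustive scheduling becomes infeasible, the sketch below counts the candidate user groups and states one common form of the PF objective. The symbols $K$ (number of users), $M$ (maximum number of users served simultaneously, for example bounded by the number of base station antennas), $\mathcal{S}$ (scheduled group), $R_k(\mathcal{S}, t)$ (achievable rate of user $k$ in interval $t$ when group $\mathcal{S}$ is served) and $\bar{R}_k(t)$ (long-term average rate of user $k$) are assumptions introduced here for illustration; the exact formulation used by \name{} may differ.

\[
  N_{\text{groups}} \;=\; \sum_{k=1}^{\min(K,M)} \binom{K}{k},
  \qquad
  N_{\text{groups}} \;=\; 2^{K} - 1 \quad \text{if } M \ge K .
\]

\[
  \mathcal{S}^{\star}(t) \;=\; \arg\max_{\substack{\mathcal{S} \subseteq \{1,\dots,K\} \\ 1 \le |\mathcal{S}| \le M}}
  \;\sum_{k \in \mathcal{S}} \frac{R_k(\mathcal{S}, t)}{\bar{R}_k(t)} .
\]

Under these assumptions, $K = 20$ users with $M \ge 20$ already give $2^{20} - 1 = 1{,}048{,}575$ candidate groups per transmission interval. Because $R_k(\mathcal{S}, t)$ depends on the whole group through the beamformer, each candidate must be evaluated separately.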