The large number of antennas in massive MIMO systems allows the base station to serve multiple users on the same time-frequency resource using multi-user beamforming. However, highly correlated user channels can drastically reduce the spectral efficiency that multi-user beamforming achieves. It is therefore critical for the base station to schedule a suitable group of users in each time-frequency resource block, maximizing spectral efficiency while adhering to fairness constraints among the users. In this paper, we consider the resource scheduling problem for massive MIMO systems, for which finding the optimal solution is known to be NP-hard. Inspired by recent achievements of deep reinforcement learning (DRL) in solving problems with large action sets, we propose \name{}, a dynamic scheduler for massive MIMO based on the state-of-the-art Soft Actor-Critic (SAC) DRL model and the K-Nearest Neighbors (KNN) algorithm. Through comprehensive simulations using realistic massive MIMO channel models as well as real-world datasets from channel measurement experiments, we demonstrate the effectiveness of our proposed model in various channel conditions. Our results show that, in medium-sized networks where the optimal proportionally fair (Opt-PF) scheduler is computationally feasible, our proposed model performs very close to Opt-PF in terms of spectral efficiency and fairness while requiring more than one order of magnitude lower computational complexity. Our results also show the feasibility and high performance of our proposed scheduler in networks with a large number of users and resource blocks.
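To make the effect of channel correlation on multi-user beamforming concrete, the following is a minimal illustrative sketch and not the scheduler proposed in the paper. It assumes zero-forcing beamforming, equal power allocation, and synthetic i.i.d. Rayleigh channels, none of which are specified in the text above; the names `zf_sum_spectral_efficiency` and `correlation` are hypothetical.

```python
import numpy as np

def zf_sum_spectral_efficiency(H, snr=10.0):
    """Sum spectral efficiency (bit/s/Hz) under zero-forcing beamforming.

    H   : (K, M) complex channel matrix, one row per scheduled user.
    snr : total transmit power over noise power (linear scale),
          split equally among the K users (an assumption of this sketch).
    """
    K = H.shape[0]
    # Zero-forcing precoder: pseudo-inverse of H, columns scaled to unit norm.
    W = np.linalg.pinv(H)                          # shape (M, K)
    W = W / np.linalg.norm(W, axis=0, keepdims=True)
    # Zero-forcing nulls inter-user interference, so only the diagonal
    # of H @ W contributes signal power.
    gains = np.abs(np.diag(H @ W)) ** 2
    return float(np.sum(np.log2(1.0 + (snr / K) * gains)))

def correlation(h1, h2):
    """Normalized channel correlation in [0, 1]."""
    return float(np.abs(np.vdot(h1, h2)) /
                 (np.linalg.norm(h1) * np.linalg.norm(h2)))

rng = np.random.default_rng(0)
M = 64  # number of base-station antennas (illustrative value)

def rayleigh(m):
    # Synthetic i.i.d. Rayleigh-fading channel vector (assumed model).
    return (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)

h1 = rayleigh(M)
h2 = rayleigh(M)                 # independent of h1: low correlation
h3 = h1 + 0.1 * rayleigh(M)      # small perturbation of h1: high correlation

rho_low, rho_high = correlation(h1, h2), correlation(h1, h3)
se_low_corr = zf_sum_spectral_efficiency(np.vstack([h1, h2]))
se_high_corr = zf_sum_spectral_efficiency(np.vstack([h1, h3]))

print(f"correlation {rho_low:.2f}: sum SE = {se_low_corr:.2f} bit/s/Hz")
print(f"correlation {rho_high:.2f}: sum SE = {se_high_corr:.2f} bit/s/Hz")
```

Under these assumptions, pairing two nearly collinear users leaves little signal power after interference nulling, so the group containing `h1` and `h3` yields a much lower sum spectral efficiency than the group containing `h1` and `h2`. This is the reason the choice of user group per resource block matters.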