Millimeter-wave (mmWave) communication systems, particularly those leveraging multi-user multiple-input multiple-output (MU-MIMO) with hybrid beamforming, face challenges in optimizing user throughput and minimizing latency due to the high complexity of dynamic beam selection and management. This paper introduces a deep reinforcement learning (DRL) approach for enhancing user throughput in multi-panel mmWave radio access networks under a practical network setup. Our DRL-based formulation employs an adaptive beam management strategy that models the interaction between the communication agent and its environment as a Markov decision process (MDP), optimizing beam selection based on real-time observations. The proposed framework exploits spatial domain (SD) characteristics by incorporating the cross-correlation between beams on different antenna panels, the measured reference signal received power (RSRP), and beam usage statistics to dynamically adjust beamforming decisions. As a result, spectral efficiency is improved and end-to-end latency is reduced. Numerical results demonstrate a throughput increase of up to 16% and a latency reduction by a factor of 3-7x compared to the baseline (legacy beam management).
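The MDP formulation described above can be illustrated with a minimal sketch. The code below is a simplified, hypothetical instance: the observation stacks the three SD features the abstract names (inter-panel beam cross-correlation, measured RSRP, and beam usage statistics), the action is a beam index, and a linear Q-function is trained with epsilon-greedy TD(0) updates against a synthetic environment. All dimensions, feature encodings, and the reward proxy are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BEAMS = 8              # hypothetical number of candidate beams
OBS_DIM = 3 * N_BEAMS    # RSRP + cross-panel correlation + usage stats per beam

# Linear Q-function: one weight row per beam (action).
W = np.zeros((N_BEAMS, OBS_DIM))
alpha, gamma, eps = 0.05, 0.9, 0.1

def observe(rsrp, xcorr, usage):
    """Stack the three spatial-domain features into one observation vector."""
    return np.concatenate([rsrp, xcorr, usage])

def select_beam(obs):
    """Epsilon-greedy beam selection over the linear Q estimates."""
    if rng.random() < eps:
        return int(rng.integers(N_BEAMS))
    return int(np.argmax(W @ obs))

def update(obs, beam, reward, next_obs):
    """One TD(0) step: move Q(obs, beam) toward reward + gamma * max Q(next)."""
    td_target = reward + gamma * np.max(W @ next_obs)
    td_error = td_target - W[beam] @ obs
    W[beam] += alpha * td_error * obs

# Toy interaction loop with a synthetic environment.
for step in range(200):
    rsrp  = rng.normal(size=N_BEAMS)    # stand-in for measured RSRP
    xcorr = rng.uniform(size=N_BEAMS)   # stand-in for inter-panel correlation
    usage = rng.uniform(size=N_BEAMS)   # stand-in for beam usage statistics
    obs = observe(rsrp, xcorr, usage)
    beam = select_beam(obs)
    reward = rsrp[beam]                 # toy throughput proxy as reward
    next_obs = observe(rng.normal(size=N_BEAMS), xcorr, usage)
    update(obs, beam, reward, next_obs)
```

In the actual system the Q-function would be a deep network and the reward would reflect measured throughput and latency; the loop above only shows the shape of the agent-environment interaction.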