Deep Reinforcement Learning for Distributed Dynamic Coordinated Beamforming in Massive MIMO Cellular Networks

To accommodate the explosive growth of wireless traffic, massive multiple-input multiple-output (MIMO) is regarded as one of the key enabling technologies for next-generation communication systems. In massive MIMO cellular networks, coordinated beamforming (CBF), which jointly designs the beamformers of multiple base stations (BSs), is an efficient method for enhancing network performance. In this paper, we investigate the sum-rate maximization problem in a massive MIMO mobile cellular network, where in each cell a multi-antenna BS serves multiple mobile users simultaneously via downlink beamforming. Although existing optimization-based CBF algorithms can provide near-optimal solutions, they require real-time global channel state information (CSI) and have high computational complexity, which makes them almost impossible to apply in practical wireless networks, especially highly dynamic mobile cellular networks. Motivated by this, we propose a deep reinforcement learning (DRL) based distributed dynamic coordinated beamforming (DDCBF) framework, which enables each BS to determine its beamformers with only local CSI and some historical information from other BSs. In addition, the beamformers can be calculated with considerably lower computational complexity by exploiting neural networks and expert knowledge, i.e., a solution structure observed from the iterative procedure of the weighted minimum mean square error (WMMSE) algorithm. Extensive numerical simulations validate the effectiveness of the proposed DRL-based approach: with lower computational complexity and less required information, it achieves performance comparable to that of centralized iterative optimization algorithms.
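
The sum-rate objective named in the abstract can be illustrated with a minimal NumPy sketch. The function name `sum_rate`, the array layout, the single-antenna users, and the additive-noise model are all assumptions made for illustration; the abstract does not specify the system model, so this should be read as one plausible instance rather than the paper's formulation.

```python
import numpy as np


def sum_rate(H, W, noise_power=1.0):
    """Downlink sum rate (bits/s/Hz) of a multi-cell network.

    Assumed (hypothetical) layout:
      H[b, c, k] : channel vector (length n_tx) from BS b to user k of cell c
      W[b, j]    : beamformer (length n_tx) used by BS b for its own user j
    Users are assumed to have a single antenna and to treat all
    intra-cell and inter-cell interference as noise.
    """
    # G[b, c, k, j] = |h_{b -> (c,k)}^H w_{b,j}|^2: power that user (c, k)
    # receives from the beam BS b forms for its user j.
    G = np.abs(np.einsum("bckn,bjn->bckj", H.conj(), W)) ** 2

    n_cells, _, n_users, _ = H.shape
    total = 0.0
    for c in range(n_cells):
        for k in range(n_users):
            signal = G[c, c, k, k]
            # Everything user (c, k) receives, minus its own desired beam.
            interference = G[:, c, k, :].sum() - signal
            sinr = signal / (interference + noise_power)
            total += np.log2(1.0 + sinr)
    return float(total)
```

In this sketch, a CBF algorithm would choose `W` to maximize the returned value subject to per-BS power limits. Evaluating the objective needs the full cross-cell channel array `H`, which is the global CSI requirement that the distributed approach is meant to avoid.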
