In this paper, we investigate a multi-receiver communication system enabled by movable antennas (MAs). Specifically, the transmit beamforming and the antenna movement at both the transmitter and receiver sides are jointly designed to maximize the sum-rate of all receivers under imperfect channel state information (CSI). Because the formulated problem is non-convex with highly coupled variables, conventional optimization methods cannot solve it efficiently. To address these challenges, we propose a learning-based algorithm, namely heterogeneous multi-agent deep deterministic policy gradient (MADDPG), which employs two agents to learn the beamforming policy and the MA movement policy, respectively. After offline training on a large number of imperfect CSI samples, the proposed heterogeneous MADDPG can output solutions for transmit beamforming and antenna movement in real time. Simulation results validate the effectiveness of the proposed algorithm and show that MAs significantly improve the sum-rate of multiple receivers compared with benchmark schemes.
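To make the optimization objective concrete, the following is a minimal sketch of how a sum-rate could be evaluated when beams are designed from an estimated channel but applied to the true one. The abstract does not state the system model, so everything here is an assumption: a multi-user MISO downlink with single-antenna receivers, fixed antenna positions (moving the MAs would change the channel entries), the hypothetical function name `sum_rate`, the toy channel values, and matched beams used purely as a placeholder rather than the MADDPG policy.

```python
import math

def sum_rate(H, W, noise_power=1.0):
    """Sum-rate in bit/s/Hz for an assumed multi-user MISO downlink.

    H[k] is the length-N complex channel vector of receiver k.
    W[j] is the length-N complex beamforming vector intended for receiver j.
    """
    total = 0.0
    for k, h_k in enumerate(H):
        # Received power at receiver k from every beam j: |h_k^H w_j|^2
        powers = [abs(sum(h.conjugate() * w for h, w in zip(h_k, w_j))) ** 2
                  for w_j in W]
        signal = powers[k]
        interference = sum(powers) - signal
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total

# Beams are designed from the estimated channel, then evaluated on the true one.
H_hat = [[1 + 0j, 0j], [0j, 1 + 0j]]            # estimated CSI (toy values)
H_true = [[1 + 0j, 0.1 + 0j], [0.1j, 1 + 0j]]   # estimate plus a small error
W = H_hat                                       # matched beams, placeholder only
print(sum_rate(H_hat, W), sum_rate(H_true, W))
```

In this toy case the rate on the true channel is lower than the rate predicted from the estimate, because the CSI error introduces inter-receiver interference that the beam design did not account for.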