Beam codebooks are integral components of future millimeter-wave (mmWave) multiple-input multiple-output (MIMO) systems, where they relax the reliance on instantaneous channel state information (CSI). Designing these codebooks is therefore a fundamental problem for such systems, and well-designed codebooks play a key role in enabling efficient and reliable communication. Prior work has primarily focused on codebook learning within a single cell/network and under stationary interference. In this work, we generalize the interference-aware codebook learning problem to networks with multiple cells/base stations. A key difference from the single-cell problem is that the underlying environment becomes non-stationary, because the behavior of one base station influences the learning of the others. Moreover, to cover some of the more challenging scenarios, information exchange between the learning nodes is not allowed, which yields a fully decentralized system in which learning is significantly harder. To tackle the non-stationarity, the measurements are averaged to estimate the interference nulling performance of a particular beam, and a decision rule is built on this estimate. Furthermore, we theoretically justify the adoption of such an estimator and prove that it is a sufficient statistic for the underlying quantity of interest in an asymptotic sense. Finally, a novel averaging-based reward function is proposed to fully decouple the learning of the multiple agents running at different nodes. Simulation results show that the developed solution is capable of learning well-shaped codebook patterns for different networks that significantly suppress the interference without information exchange, highlighting ...
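The averaging idea in the abstract can be illustrated with a small toy sketch. It is not the authors' algorithm: the uniform linear array model, the random scaling `neighbor_gain` that stands in for a neighboring base station changing its own beam between measurements, the Gaussian measurement noise, and the "pick the lowest averaged interference" rule are all assumptions made for illustration, as are the function names.

```python
import cmath
import math
import random


def steering_vector(num_antennas, angle_rad):
    """Response of a half-wavelength uniform linear array (assumed model)."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle_rad))
            for n in range(num_antennas)]


def beam_gain(beam, channel):
    """Normalized power |w^H h|^2 / N picked up by beam w from channel h."""
    inner = sum(w.conjugate() * h for w, h in zip(beam, channel))
    return abs(inner) ** 2 / len(beam)


def averaged_interference(beam, interferer_channel, rng,
                          num_measurements=200, noise_std=0.05):
    """Average noisy interference-power measurements for one beam.

    Each measurement is scaled by a random factor (hypothetical) that mimics
    the neighboring node switching beams, which makes single measurements
    unreliable while their average stays informative.
    """
    total = 0.0
    for _ in range(num_measurements):
        neighbor_gain = rng.uniform(0.2, 1.0)  # assumed non-stationary term
        noise = rng.gauss(0.0, noise_std)      # assumed measurement noise
        total += neighbor_gain * beam_gain(beam, interferer_channel) + noise
    return total / num_measurements


def select_beam(candidates, interferer_channel, rng):
    """Assumed decision rule: keep the beam with the lowest averaged interference."""
    scores = [averaged_interference(b, interferer_channel, rng)
              for b in candidates]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Toy setup: 8 antennas, interferer at sin(angle) = 0.25.
interferer = steering_vector(8, math.asin(0.25))
candidates = [
    steering_vector(8, math.asin(0.25)),  # beam pointed at the interferer
    steering_vector(8, 0.0),              # broadside beam with a null at the interferer
]
best_index, scores = select_beam(candidates, interferer, random.Random(0))
print(best_index, scores)
```

In this toy setup the broadside beam has an exact null in the interferer's direction, so its averaged score stays near zero and the rule selects it, while individual measurements of the other beam fluctuate with the neighbor's assumed behavior.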