Graph-Neural Multi-Agent Coordination for Distributed Access-Point Selection in Cell-Free Massive MIMO

Cell-free massive MIMO (CFmMIMO) systems require scalable and reliable distributed coordination mechanisms to operate under stringent communication and latency constraints. A central challenge is the Access Point Selection (APS) problem, which seeks the subset of serving Access Points (APs) for each User Equipment (UE) that satisfies the UEs' Spectral Efficiency (SE) requirements while minimizing network power consumption. We introduce APS-GNN, a scalable distributed multi-agent learning framework that decomposes APS into agents operating at the granularity of individual AP-UE connections. Agents coordinate by exchanging local observations over a novel Graph Neural Network (GNN) architecture and share parameters to reuse their knowledge and experience. APS-GNN adopts a constrained reinforcement learning approach that gives agents explicit observability of the conflicting objectives of APS, treating SE satisfaction as a cost and power reduction as a reward. Both signals are defined locally, facilitating effective credit assignment and scalable coordination in large networks. To further improve training stability and exploration efficiency, the policy is initialized via supervised imitation learning from a heuristic APS baseline. We develop a realistic CFmMIMO simulator and demonstrate that APS-GNN delivers the target SE while activating 50-70% fewer APs than heuristic and centralized Multi-Agent Reinforcement Learning (MARL) baselines across different evaluation scenarios. Moreover, APS-GNN achieves one to two orders of magnitude lower inference latency than centralized MARL approaches due to its fully parallel and distributed execution. These results establish APS-GNN as a practical and scalable solution for APS in large-scale CFmMIMO networks.
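
To illustrate the constrained formulation described in the abstract, the sketch below shows one common way a per-connection agent could combine a local power-saving reward with a local SE-shortfall cost, namely Lagrangian relaxation with a dual-ascent multiplier update. The abstract does not give the exact signal definitions or the constrained optimizer, so everything here is an assumption made for illustration: the names `LinkSignal`, `local_reward`, `local_cost`, `lagrangian_objective`, and `update_multiplier`, the per-link power figure, and the choice of Lagrangian relaxation itself.

```python
from dataclasses import dataclass


@dataclass
class LinkSignal:
    """Local quantities seen by one AP-UE connection agent (all hypothetical)."""
    active: bool          # whether this AP-UE connection is switched on
    link_power_w: float   # assumed power drawn by the link when active (W)
    ue_se: float          # SE currently achieved by the UE (bit/s/Hz)
    ue_se_target: float   # SE requirement of the UE (bit/s/Hz)


def local_reward(sig: LinkSignal) -> float:
    """Reward: power saved by keeping this connection switched off."""
    return 0.0 if sig.active else sig.link_power_w


def local_cost(sig: LinkSignal) -> float:
    """Cost: how far the UE falls short of its SE target (0 if satisfied)."""
    return max(0.0, sig.ue_se_target - sig.ue_se)


def lagrangian_objective(sig: LinkSignal, lam: float) -> float:
    """Scalarized objective: reward minus multiplier-weighted cost."""
    return local_reward(sig) - lam * local_cost(sig)


def update_multiplier(lam: float, mean_cost: float,
                      cost_budget: float = 0.0, step: float = 0.1) -> float:
    """Dual ascent: raise lam while the cost exceeds its budget, keep lam >= 0."""
    return max(0.0, lam + step * (mean_cost - cost_budget))


if __name__ == "__main__":
    sig = LinkSignal(active=False, link_power_w=0.2, ue_se=1.5, ue_se_target=2.0)
    lam = 2.0
    print("reward:", local_reward(sig))
    print("cost:", local_cost(sig))
    print("objective:", lagrangian_objective(sig, lam))
    print("updated multiplier:", update_multiplier(lam, local_cost(sig)))
```

Keeping the reward and the cost as separate signals, rather than merging them into one hand-weighted scalar, lets the multiplier adapt the trade-off during training. That is one way to read the abstract's remark about explicit observability of the two conflicting objectives.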
