In partially observable cooperative tasks, agents in a multi-agent system must communicate effectively and understand how they interact with one another in order to achieve shared goals. However, existing communication-based multi-agent reinforcement learning (MARL) methods lack evaluation metrics for information weights and do not model communication at the information level. As a result, agents neglect the aggregation of multiple messages, which significantly reduces policy learning efficiency. In this paper, we propose pluggable adaptive generative networks (PAGNet), a novel framework that integrates generative models into MARL to enhance communication and decision-making. PAGNet enables agents to synthesize global state representations from weighted local observations and to use these representations, together with learned communication weights, for coordinated decision-making. This pluggable approach reduces the computational demands typically associated with jointly training communication and policy networks. Extensive experimental evaluations across diverse benchmarks and communication scenarios demonstrate the significant performance improvements achieved by PAGNet. Furthermore, we analyze the emergent communication patterns and the quality of the generated global states, providing insight into the framework's operational mechanisms.
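To make the idea of synthesizing a global state from weighted local observations concrete, the following is a minimal illustrative sketch and not the PAGNet implementation. It assumes that each agent's observation is a fixed-length vector, that communication weights come from a softmax over per-agent scores, and that the generative model can be stood in for by a fixed linear map followed by `tanh`. The function names `aggregate_observations` and `generate_global_state`, the `scores` input, and the `projection` matrix are all hypothetical and do not appear in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_observations(observations, scores):
    """Combine per-agent observation vectors into one weighted sum.

    observations: list of equal-length lists, one per agent.
    scores: unnormalized communication scores, one per agent
            (hypothetical; a real system would learn these).
    Returns (weights, aggregated_vector).
    """
    if len(observations) != len(scores):
        raise ValueError("need exactly one score per observation")
    weights = softmax(scores)
    dim = len(observations[0])
    aggregated = [
        sum(w * obs[d] for w, obs in zip(weights, observations))
        for d in range(dim)
    ]
    return weights, aggregated


def generate_global_state(aggregated, projection):
    """Placeholder for a generative model: linear map followed by tanh.

    projection: list of rows, each the same length as `aggregated`.
    """
    return [
        math.tanh(sum(p * a for p, a in zip(row, aggregated)))
        for row in projection
    ]
```

In this sketch an agent would call `aggregate_observations` on the messages it receives, pass the result through `generate_global_state`, and then condition its policy on both the generated state and the returned weights. How PAGNet actually computes the weights and trains the generator is not specified in the abstract.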