Multi-agent systems require effective coordination between groups and individuals to achieve common goals. However, current multi-agent reinforcement learning (MARL) methods focus primarily on improving individual policies and do not adequately address group-level policies, which leads to weak cooperation. To address this issue, we propose a novel Consensus-oriented Strategy (CoS) that emphasizes group and individual policies simultaneously. CoS comprises two main components: (a) a vector quantized group consensus module, which extracts discrete latent embeddings that represent a stable and discriminative group consensus, and (b) a group consensus-oriented strategy, which integrates the group policy using a hypernet and the individual policies using the group consensus, thereby promoting coordination at both the group and individual levels. In experiments on cooperative navigation tasks with both discrete and continuous spaces, as well as on Google Research Football, CoS outperforms state-of-the-art MARL algorithms and achieves better collaboration, making it a promising approach to effective coordination in multi-agent systems.
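To make the idea of a discrete group consensus concrete, the following is a minimal sketch of the vector quantization lookup step: a continuous group-level latent is snapped to its nearest entry in a finite codebook. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say how the group latent is formed, so the mean pooling over per-agent encodings is an assumption, and the function names, codebook values, and agent encodings are all hypothetical. Codebook training and the hypernet are not shown.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average per-agent encodings into one group-level latent.

    Assumption: the abstract does not specify the pooling; a mean is used
    here only to obtain a single vector to quantize.
    """
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def quantize(
    latent: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return the index and a copy of the codebook entry nearest to `latent`."""
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(latent, codebook[k]),
    )
    return index, list(codebook[index])


# Hypothetical 2-D codebook holding three discrete consensus codes.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]

# Hypothetical encodings produced by three agents.
agent_latents = [[0.9, 1.2], [1.1, 0.7], [0.8, 1.0]]

group_latent = mean_pool(agent_latents)
consensus_index, consensus_embedding = quantize(group_latent, codebook)
print(consensus_index, consensus_embedding)
```

Because every agent in the group would condition on the same selected code, small fluctuations in the continuous latent do not change the consensus unless the nearest codebook entry changes, which is one plausible reading of the "stable and discriminative" property described in the abstract.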