Mean field type games (MFTGs) describe Nash equilibria between large coalitions: each coalition consists of a continuum of cooperative agents who maximize the average reward of their coalition while interacting non-cooperatively with a finite number of other coalitions. Although the theory has been extensively developed, efficient and scalable computational methods are still lacking. Here, we develop reinforcement learning methods for such games in a finite-space setting with general dynamics and reward functions. We start by proving that the MFTG solution yields approximate Nash equilibria in finite-size coalition games. We then propose two algorithms. The first is based on quantization of the mean-field spaces and Nash Q-learning, and we provide a convergence and stability analysis. The second is a deep reinforcement learning algorithm, which scales to larger spaces. Numerical experiments in five environments, with mean-field distributions of dimension up to $200$, demonstrate the scalability and efficiency of the proposed methods.
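To make the quantization idea concrete: a mean-field distribution over a finite state space is a point on the probability simplex, and one natural discretization restricts each coordinate to multiples of $1/k$. The sketch below is a hypothetical quantization scheme for illustration only (the function name and the largest-remainder rounding rule are our assumptions, not the paper's exact construction).

```python
import numpy as np

def quantize_simplex(mu, k):
    """Map a probability vector mu onto the nearest grid point of the
    discretized simplex whose coordinates are multiples of 1/k.
    NOTE: illustrative largest-remainder rounding; the paper's actual
    quantization may differ."""
    scaled = mu * k
    floor = np.floor(scaled).astype(int)
    # After flooring, some probability mass (in units of 1/k) is left over.
    remainder = k - floor.sum()
    # Assign the leftover units to the coordinates with the largest
    # fractional parts, so the result stays close to mu and sums to 1.
    frac = scaled - floor
    idx = np.argsort(-frac)[:remainder]
    floor[idx] += 1
    return floor / k

# Example: quantize a 3-state mean field with grid resolution k = 10.
mu = np.array([0.37, 0.41, 0.22])
q = quantize_simplex(mu, 10)
```

A Q-table indexed by such grid points is finite, which is what makes tabular Nash Q-learning applicable; the grid size grows quickly with the dimension of the state space, which motivates the deep RL variant for larger problems.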