Mean field type games (MFTGs) describe Nash equilibria between large coalitions: each coalition consists of a continuum of cooperative agents who maximize the average reward of their coalition while interacting non-cooperatively with a finite number of other coalitions. Although the theory has been extensively developed, efficient and scalable computational methods are still lacking. Here, we develop reinforcement learning methods for such games in a finite-space setting with general dynamics and reward functions. We start by proving that the MFTG solution yields approximate Nash equilibria in finite-size coalition games. We then propose two algorithms. The first is based on quantization of the mean-field spaces and Nash Q-learning, for which we provide a convergence and stability analysis. The second is a deep reinforcement learning algorithm, which scales to larger spaces. Numerical examples on five environments demonstrate the scalability and efficiency of the proposed methods.
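To make the quantization idea concrete, the following is a minimal sketch (not the paper's implementation) of one standard way to discretize a mean-field space: the mean field over a finite state space lives on the probability simplex, which can be covered by a grid of step 1/n_bins, and each empirical distribution is projected to its nearest grid point before being used as part of the state in tabular Q-learning. All names here (`quantize_simplex`, `project`, `n_bins`) are illustrative assumptions.

```python
import itertools
import numpy as np

def quantize_simplex(n_states, n_bins):
    """Enumerate all grid points of the probability simplex over
    n_states states with resolution 1/n_bins. Each grid point is a
    candidate quantized mean-field distribution."""
    pts = []
    for c in itertools.product(range(n_bins + 1), repeat=n_states):
        if sum(c) == n_bins:
            pts.append(np.array(c) / n_bins)
    return np.array(pts)

def project(mu, grid):
    """Map an empirical mean field mu to the index of its nearest
    grid point (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(grid - mu, axis=1)))

# 3 states, resolution 1/4: the grid has C(4+3-1, 3-1) = 15 points.
grid = quantize_simplex(n_states=3, n_bins=4)
idx = project(np.array([0.3, 0.5, 0.2]), grid)  # snaps to (0.25, 0.5, 0.25)
```

A tabular Q-function can then be indexed by the pair (individual state, quantized mean-field index); the grid size grows combinatorially in `n_states` and `n_bins`, which is precisely the scalability limitation that motivates the deep reinforcement learning variant.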