Mean field games (MFGs) provide a framework for modeling and approximating the behavior of large populations of agents. Computing equilibria in MFGs has attracted sustained interest in multi-agent reinforcement learning, yet theoretical guarantees that the last updated policy converges to an equilibrium have remained limited. We propose a simple proximal-point (PP)-type method for computing MFG equilibria and provide the first last-iterate convergence (LIC) guarantee under the Lasry--Lions-type monotonicity condition. We also propose an approximation of the PP update rule ($\mathtt{APP}$), based on the observation that each PP step is equivalent to solving a regularized MFG, which can in turn be solved by mirror descent. We further establish that this regularized mirror descent achieves LIC at an exponential rate. Our numerical experiments demonstrate that $\mathtt{APP}$ computes equilibria efficiently.
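As a minimal sketch of the update described above (the step size $\tau > 0$, the KL regularizer, the expected return $J$, and the induced mean field $\mu^{\pi}$ are assumed notation for illustration, not taken from the abstract), each PP step can be read as computing the equilibrium of a KL-regularized MFG anchored at the current iterate $\pi^{k}$:
\[
\pi^{k+1} \;\in\; \operatorname*{arg\,max}_{\pi}\; J\bigl(\pi,\, \mu^{\pi^{k+1}}\bigr) \;-\; \frac{1}{\tau}\,\mathrm{KL}\bigl(\pi \,\big\|\, \pi^{k}\bigr),
\]
a fixed-point condition in $\pi^{k+1}$, since the mean field is induced by the equilibrium policy itself. Under this reading, $\mathtt{APP}$ replaces the exact inner solve with mirror-descent iterations on the regularized game, which is where the exponential-rate LIC of regularized mirror descent enters.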