Fictitious Play (FP) is a simple and natural dynamic for repeated play with many applications in game theory and multi-agent reinforcement learning. It was introduced by Brown (1949, 1951), and its convergence properties for two-player zero-sum games were established later by Robinson (1951). Potential games (Monderer and Shapley, 1996b) are another class of games that exhibit the FP property (Monderer and Shapley, 1996a), i.e., FP dynamics converge to a Nash equilibrium if all agents follow them. Nevertheless, except for two-player zero-sum games and for specific instances of payoff matrices (Abernethy et al., 2021) or for adversarial tie-breaking rules (Daskalakis and Pan, 2014), the convergence rate of FP is unknown. In this work, we focus on the rate of convergence of FP when applied to potential games and, more specifically, identical payoff games. We prove that FP can take exponential time (in the number of strategies) to reach a Nash equilibrium, even when the game is restricted to two agents, and that this holds for arbitrary tie-breaking rules. To prove this, we recursively construct a two-player coordination game with a unique Nash equilibrium. Moreover, every approximate Nash equilibrium in the constructed game must be close to the pure Nash equilibrium in $\ell_1$-distance.
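To make the dynamic concrete, the following is a minimal sketch of simultaneous fictitious play in a two-player identical payoff game, where each agent best-responds to the empirical frequency of the other agent's past actions. The function names `fictitious_play` and `is_pure_nash`, the lowest-index tie-breaking rule, and the 3x3 payoff matrix `A` are all assumptions chosen for illustration. In particular, `A` is a hypothetical toy matrix and not the recursive construction used in the proof.

```python
def fictitious_play(A, steps, i0=0, j0=0):
    """Simultaneous fictitious play on an identical payoff game.

    A[i][j] is the common payoff when the row agent plays i and the
    column agent plays j. Ties are broken toward the lowest index,
    which is just one possible rule (an assumption of this sketch).
    Returns the list of action pairs played.
    """
    n, m = len(A), len(A[0])
    row_counts = [0] * n  # how often each row action has been played
    col_counts = [0] * m  # how often each column action has been played
    i, j = i0, j0
    history = []
    for _ in range(steps):
        history.append((i, j))
        row_counts[i] += 1
        col_counts[j] += 1
        # Cumulative payoff of each action against the opponent's history;
        # maximizing it is the same as best-responding to the empirical mix.
        row_scores = [sum(A[a][b] * col_counts[b] for b in range(m))
                      for a in range(n)]
        col_scores = [sum(A[a][b] * row_counts[a] for a in range(n))
                      for b in range(m)]
        # max() returns the first maximizer, i.e. the lowest index on ties.
        i = max(range(n), key=lambda a: row_scores[a])
        j = max(range(m), key=lambda b: col_scores[b])
    return history


def is_pure_nash(A, i, j):
    """Check that neither agent gains by deviating unilaterally from (i, j)."""
    n, m = len(A), len(A[0])
    return (all(A[i][j] >= A[a][j] for a in range(n))
            and all(A[i][j] >= A[i][b] for b in range(m)))


# Hypothetical toy game: payoffs increase along a "staircase"
# (0,0) -> (0,1) -> (1,1) -> (1,2) -> (2,2); (2,2) is the only pure equilibrium.
A = [
    [1, 2, 0],
    [0, 3, 4],
    [0, 0, 5],
]
history = fictitious_play(A, steps=200)
first_hit = next(t for t, (i, j) in enumerate(history) if is_pure_nash(A, i, j))
print("first round at a pure Nash equilibrium:", first_hit, history[first_hit])
```

On this toy matrix the agents climb the staircase one step at a time and then settle at the pure Nash equilibrium (2, 2). Each step takes longer than the one before it, because the counts accumulated on earlier actions must first be outweighed. This only illustrates why accumulated history can slow FP down; it does not reproduce the exponential lower bound.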