Follow-the-regularized-leader (FTRL) is the premier algorithm for online optimization. Yet, despite decades of research on its convergence in constrained optimization, and in potential games in particular, its behavior has hitherto remained poorly understood. In this paper, we establish that FTRL can take exponential time to converge to a Nash equilibrium in two-player potential games, for any (permutation-invariant) regularizer and even with a potentially vanishing learning rate. By known equivalences, this translates to an exponential lower bound for certain mirror descent counterparts, most notably the multiplicative weights update. On the positive side, we establish the potential property for FTRL and obtain an exponential upper bound of $\exp(O_\varepsilon(1/\varepsilon^2))$ for any no-regret dynamics executed in a lazy, alternating fashion, matching our lower bound up to factors in the exponent. Finally, in multi-player potential games, we show that fictitious play, the extreme version of FTRL, can take doubly exponential time to reach a Nash equilibrium. This constitutes an exponentially stronger lower bound for this foundational learning algorithm in games.
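As a concrete illustration of the dynamics discussed above (not part of the paper's formal development), the following is a minimal sketch of FTRL with an entropic regularizer, which coincides with the multiplicative weights update, running in a two-player identical-interest game, a special case of a potential game. The payoff matrix `A`, learning rate `eta`, and horizon `T` are hypothetical choices made purely for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# A 2x2 identical-interest game: both players receive the same payoff,
# so the payoff function itself is an exact potential function.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])

eta = 0.1         # learning rate (hypothetical choice)
T = 1000          # number of rounds (hypothetical choice)
S1 = np.zeros(2)  # cumulative expected payoffs of player 1's actions
S2 = np.zeros(2)  # cumulative expected payoffs of player 2's actions

for t in range(T):
    # FTRL with the (negative-entropy) entropic regularizer: the argmax of
    # eta * <S, x> + entropy(x) over the simplex has the softmax form,
    # which is exactly the multiplicative weights update.
    x = softmax(eta * S1)
    y = softmax(eta * S2)
    # Accumulate each player's expected payoffs against the opponent's mix.
    S1 += A @ y
    S2 += A.T @ x

print("player 1 strategy:", np.round(x, 3))
print("player 2 strategy:", np.round(y, 3))
```

In this symmetric instance the dynamics settle quickly on a pure Nash equilibrium; the paper's lower-bound constructions are, by contrast, carefully designed potential games on which such convergence is exponentially slow.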