We study last-iterate convergence properties of algorithms for solving two-player zero-sum games based on Regret Matching$^+$ (RM$^+$). Despite the widespread use of these algorithms for solving real games, virtually nothing is known about their last-iterate convergence. A major obstacle to analyzing RM-type dynamics is that their regret operators are neither Lipschitz nor (pseudo)monotone. We start by showing numerically that several variants used in practice, such as RM$^+$, predictive RM$^+$, and alternating RM$^+$, may fail to converge in the last iterate, even on a simple $3\times 3$ matrix game. We then prove that recent variants of these algorithms based on a smoothing technique, extragradient RM$^+$ and smooth predictive RM$^+$, enjoy asymptotic last-iterate convergence (without a rate), $1/\sqrt{t}$ best-iterate convergence, and, when combined with restarting, linear-rate last-iterate convergence. Our analysis builds on a new characterization of the geometric structure of the limit points of our algorithms, marking a significant departure from most of the literature on last-iterate convergence. We believe that our analysis may be of independent interest and offers a fresh perspective for studying last-iterate convergence in algorithms based on non-monotone operators.
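To make the distinction between last-iterate and average-iterate behavior concrete, the following is a minimal sketch of simultaneous RM$^+$ self-play on a matrix game, using the standard update $R^{t+1} = [R^t + u^t - \langle x^t, u^t\rangle \mathbf{1}]^+$ with strategies proportional to $R^{t+1}$. The payoff matrix `A`, the helper names, and the step count are illustrative assumptions, not the instance or code from the experiments described above; the row player is assumed to minimize $x^\top A y$.

```python
def normalize(regrets):
    """Strategy proportional to the regrets; uniform if they are all zero."""
    total = sum(regrets)
    n = len(regrets)
    if total <= 0.0:
        return [1.0 / n] * n
    return [r / total for r in regrets]

def row_payoffs(A, y):
    """(A y)_i: expected payoff of each pure row action against y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]

def col_payoffs(A, x):
    """(A^T x)_j: expected payoff of each pure column action against x."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]

def duality_gap(A, x, y):
    """max_j (A^T x)_j - min_i (A y)_i, which is zero exactly at a Nash equilibrium."""
    return max(col_payoffs(A, x)) - min(row_payoffs(A, y))

def rm_plus_update(regrets, strategy, utilities):
    """One RM+ step: add the instantaneous regrets, then clip at zero."""
    expected = sum(s * u for s, u in zip(strategy, utilities))
    return [max(r + u - expected, 0.0) for r, u in zip(regrets, utilities)]

def run_rm_plus(A, steps):
    n, m = len(A), len(A[0])
    Rx, Ry = [0.0] * n, [0.0] * m
    x, y = normalize(Rx), normalize(Ry)
    x_sum, y_sum = [0.0] * n, [0.0] * m
    gaps = []  # duality gap of each played iterate (x^t, y^t)
    for _ in range(steps):
        x_sum = [s + v for s, v in zip(x_sum, x)]
        y_sum = [s + v for s, v in zip(y_sum, y)]
        gaps.append(duality_gap(A, x, y))
        ux = [-v for v in row_payoffs(A, y)]  # row player minimizes x^T A y
        uy = col_payoffs(A, x)                # column player maximizes x^T A y
        Rx = rm_plus_update(Rx, x, ux)
        Ry = rm_plus_update(Ry, y, uy)
        x, y = normalize(Rx), normalize(Ry)
    x_avg = [s / steps for s in x_sum]
    y_avg = [s / steps for s in y_sum]
    return x, y, x_avg, y_avg, gaps

# Hypothetical 3x3 payoff matrix, chosen only for illustration.
A = [[2.0, -1.0, 0.0],
     [-1.0, 1.0, 1.0],
     [0.0, 1.0, -2.0]]

x_last, y_last, x_avg, y_avg, gaps = run_rm_plus(A, 5000)
print("gap of final played iterate:", gaps[-1])
print("gap of average iterate:     ", duality_gap(A, x_avg, y_avg))
```

The average-iterate gap is controlled by the players' regrets and therefore vanishes as the number of steps grows; the sequence `gaps` carries no such guarantee for plain RM$^+$, and plotting it is one way to inspect last-iterate behavior on a given instance.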