Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

Algorithms based on regret matching, in particular regret matching$^+$ (RM$^+$) and its variants, are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent ascent, which have strong last-iterate and ergodic convergence properties for zero-sum games, virtually nothing is known about the last-iterate properties of regret-matching algorithms. Last-iterate convergence matters both for numerical optimization and as a model of real-world learning in games, so in this paper we study the last-iterate convergence properties of several popular variants of RM$^+$. First, we show numerically that several practical variants, namely simultaneous RM$^+$, alternating RM$^+$, and simultaneous predictive RM$^+$, all lack last-iterate convergence guarantees even on a simple $3\times 3$ game. We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence: extragradient RM$^+$ and smooth predictive RM$^+$ enjoy asymptotic last-iterate convergence (without a rate) and $1/\sqrt{t}$ best-iterate convergence. Finally, we introduce restarted variants of these algorithms and show that they enjoy linear-rate last-iterate convergence.
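To make the distinction between the last iterate and the average iterate concrete, here is a minimal sketch of simultaneous RM$^+$ on a matrix game $\min_x \max_y x^\top A y$. It is an illustration under stated assumptions, not the paper's implementation: the function names, the uniform averaging of iterates, and the payoff matrix `A_demo` are all hypothetical, and `A_demo` is an arbitrary $3\times 3$ example, not the game used in the paper's experiments.

```python
import numpy as np


def normalize(r):
    """Map a nonnegative regret vector to a strategy (uniform if all zero)."""
    s = r.sum()
    return r / s if s > 0 else np.full(len(r), 1.0 / len(r))


def duality_gap(A, x, y):
    """Gap of (x, y) for min_x max_y x^T A y; it is zero exactly at a Nash equilibrium."""
    return float(np.max(A.T @ x) - np.min(A @ y))


def simultaneous_rm_plus(A, T):
    """Run T rounds of simultaneous RM+; return the last and the uniformly averaged iterates."""
    n, m = A.shape
    Rx, Ry = np.zeros(n), np.zeros(m)      # thresholded cumulative regrets
    x_sum, y_sum = np.zeros(n), np.zeros(m)
    for _ in range(T):
        x, y = normalize(Rx), normalize(Ry)
        loss_x = A @ y                     # the x player minimizes x^T A y
        loss_y = -(A.T @ x)                # the y player maximizes it
        # Add the instantaneous regret, then clip at zero (the "+" in RM+).
        Rx = np.maximum(Rx + (x @ loss_x) - loss_x, 0.0)
        Ry = np.maximum(Ry + (y @ loss_y) - loss_y, 0.0)
        x_sum += x
        y_sum += y
    return x, y, x_sum / T, y_sum / T


# Hypothetical payoff matrix with entries in [-1, 1]; NOT the paper's game.
A_demo = np.array([[0.0, -1.0, 0.5],
                   [1.0, 0.0, -0.5],
                   [-0.5, 1.0, 0.0]])

if __name__ == "__main__":
    x, y, x_avg, y_avg = simultaneous_rm_plus(A_demo, 10_000)
    print("last-iterate gap:   ", duality_gap(A_demo, x, y))
    print("average-iterate gap:", duality_gap(A_demo, x_avg, y_avg))
```

The average-iterate gap is bounded by the players' combined regret divided by the number of rounds, so it shrinks as the run gets longer. The last-iterate gap carries no such guarantee for this variant, which is the behavior the abstract describes.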

