In a recent series of papers it has been established that variants of Gradient Descent/Ascent and Mirror Descent exhibit last iterate convergence in convex-concave zero-sum games. Specifically, \cite{DISZ17, LiangS18} show last iterate convergence of the so-called ``Optimistic Gradient Descent/Ascent'' for the case of \textit{unconstrained} min-max optimization. Moreover, in \cite{Metal} the authors show that Mirror Descent with an extra gradient step exhibits last iterate convergence for convex-concave problems (both constrained and unconstrained); however, their algorithm does not follow the online learning framework, since it uses extra information rather than \textit{only} the history to compute the next iterate. In this work, we show that ``Optimistic Multiplicative-Weights Update (OMWU)'', which follows the no-regret online learning framework, exhibits local last iterate convergence for convex-concave games, generalizing the results of \cite{DP19}, where last iterate convergence of OMWU was shown only for the \textit{bilinear case}. We complement our results with experiments that indicate fast convergence of the method.
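For concreteness, a minimal sketch of the OMWU dynamics in one common formulation (the notation here is illustrative: $\eta>0$ denotes the step size, $f$ the convex-concave objective, and both players are constrained to probability simplices):
\begin{align*}
x_{t+1}(i) &\propto x_t(i)\,\exp\!\Big(-\eta\big[2\,\nabla_x f(x_t,y_t)_i-\nabla_x f(x_{t-1},y_{t-1})_i\big]\Big),\\
y_{t+1}(j) &\propto y_t(j)\,\exp\!\Big(+\eta\big[2\,\nabla_y f(x_t,y_t)_j-\nabla_y f(x_{t-1},y_{t-1})_j\big]\Big),
\end{align*}
where the proportionality hides the normalization that keeps $x_{t+1}$ and $y_{t+1}$ probability distributions; in the bilinear case $f(x,y)=x^\top A y$, the gradients reduce to $Ay_t$ and $A^\top x_t$.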