Several widely used first-order saddle-point optimization methods, when derived naively, yield the same continuous-time ordinary differential equation (ODE) as the Gradient Descent Ascent (GDA) method. However, the convergence properties of these methods are qualitatively different, even on simple bilinear games. As a result, the ODE perspective, which has proved powerful in analyzing single-objective optimization methods, has not played a similar role in saddle-point optimization. We adopt a framework studied in fluid dynamics, known as High-Resolution Differential Equations (HRDEs), to design differential equation models for several saddle-point optimization methods. Critically, the resulting HRDEs differ from one saddle-point optimization method to another. Moreover, in bilinear games, the convergence properties of the HRDEs match the qualitative features of the corresponding discrete methods. Additionally, we show that the HRDE of Optimistic Gradient Descent Ascent (OGDA) exhibits \emph{last-iterate convergence} for general monotone variational inequalities. Finally, we provide rates for the \emph{best-iterate convergence} of the OGDA method, relying solely on the first-order smoothness of the monotone operator.
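To illustrate the qualitative gap on bilinear games, the following minimal sketch compares discrete GDA and OGDA on the scalar game f(x, y) = xy. The test problem, the step size, the iteration count, the OGDA update form z_{k+1} = z_k - 2sF(z_k) + sF(z_{k-1}), and the function names `run_gda` and `run_ogda` are all assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def F(z):
    # Saddle-point vector field of the assumed game f(x, y) = x * y:
    # (df/dx, -df/dy) = (y, -x). The unique saddle point is the origin.
    x, y = z
    return (y, -x)


def run_gda(z0, step, iters):
    # Simultaneous Gradient Descent Ascent: z_{k+1} = z_k - step * F(z_k).
    x, y = z0
    for _ in range(iters):
        gx, gy = F((x, y))
        x, y = x - step * gx, y - step * gy
    return math.hypot(x, y)  # distance to the saddle point


def run_ogda(z0, step, iters):
    # Optimistic GDA (assumed form):
    # z_{k+1} = z_k - 2 * step * F(z_k) + step * F(z_{k-1}),
    # initialised with z_{-1} = z_0, so the first step is a plain GDA step.
    z = z0
    g_prev = F(z0)
    for _ in range(iters):
        g = F(z)
        z = (z[0] - 2 * step * g[0] + step * g_prev[0],
             z[1] - 2 * step * g[1] + step * g_prev[1])
        g_prev = g
    return math.hypot(z[0], z[1])


if __name__ == "__main__":
    z0, step, iters = (1.0, 1.0), 0.1, 2000  # illustrative values only
    print("initial distance:", math.hypot(*z0))
    print("GDA  distance   :", run_gda(z0, step, iters))
    print("OGDA distance   :", run_ogda(z0, step, iters))
```

Under these assumptions, each GDA step multiplies the distance to the origin by sqrt(1 + s^2), so the iterates spiral outward, while the OGDA iterates contract toward the saddle point. Both methods share the same naive ODE limit, which is the mismatch the abstract describes.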