In this paper, we propose a quasi-Newton method for solving smooth and monotone nonlinear equations, which include unconstrained minimization and minimax optimization as special cases. For the strongly monotone setting, we establish two global convergence bounds: (i) a linear convergence rate that matches the rate of the celebrated extragradient method, and (ii) an explicit global superlinear convergence rate that provably surpasses the linear convergence rate after at most $O(d)$ iterations, where $d$ is the problem's dimension. In addition, for the case where the operator is only monotone, we prove a global convergence rate of $O(\min\{1/k, \sqrt{d}/k^{1.25}\})$ in terms of the duality gap, where $k$ is the number of iterations. This matches the rate of the extragradient method when $k = O(d^2)$ and is faster when $k = \Omega(d^2)$. These are the first global convergence results to demonstrate a provable advantage of a quasi-Newton method over the extragradient method without querying the Jacobian of the operator. Unlike classical quasi-Newton methods, we achieve this by using the hybrid proximal extragradient framework and a novel online learning approach for updating the Jacobian approximation matrices. Specifically, guided by the convergence analysis, we formulate the Jacobian approximation update as an online convex optimization problem over non-symmetric matrices and relate the regret of the online problem to the convergence rate of our method. To facilitate efficient implementation, we further develop a tailored online learning algorithm based on an approximate separation oracle, which preserves structures such as symmetry and sparsity in the Jacobian matrices.
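The threshold $d^2$ in the monotone case can be read off by comparing the two terms inside the minimum. The short calculation below ignores the constants hidden by the $O(\cdot)$ notation, so the crossover point should be understood only up to constant factors:

$$
\frac{\sqrt{d}}{k^{1.25}} \le \frac{1}{k}
\iff \sqrt{d} \le k^{0.25}
\iff k \ge d^2 .
$$

Under this reading, the bound behaves like $1/k$ while $k \le d^2$, and like $\sqrt{d}/k^{1.25}$ once $k \ge d^2$, where it is smaller than $1/k$ by a factor of $\sqrt{d}/k^{0.25} = (d^2/k)^{1/4}$.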