In this paper, we propose an accelerated quasi-Newton proximal extragradient (A-QNPE) method for solving unconstrained smooth convex optimization problems. With access only to the gradients of the objective, we prove that our method achieves a convergence rate of ${O}\bigl(\min\{\frac{1}{k^2}, \frac{\sqrt{d\log k}}{k^{2.5}}\}\bigr)$, where $d$ is the problem dimension and $k$ is the number of iterations. In particular, in the regime where $k = {O}(d)$, our method matches the optimal rate of ${O}(\frac{1}{k^2})$ attained by Nesterov's accelerated gradient (NAG) method. Moreover, in the regime where $k = \Omega(d \log d)$, it outperforms NAG and converges at the faster rate of ${O}\bigl(\frac{\sqrt{d\log k}}{k^{2.5}}\bigr)$. To the best of our knowledge, this is the first result to demonstrate a provable gain of a quasi-Newton-type method over NAG in the convex setting. To achieve these results, we build our method on a recent variant of the Monteiro-Svaiter acceleration framework and adopt an online learning perspective for updating the Hessian approximation matrices: we relate the convergence rate of our method to the dynamic regret of a specific online convex optimization problem in the space of matrices.
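To make the two regimes concrete, the following minimal Python sketch evaluates the two terms inside the stated bound and locates the iteration count at which the quasi-Newton term becomes the smaller one. It is only an illustration of the rate expression, not an implementation of the method: the constants hidden by the ${O}(\cdot)$ notation are assumed to be 1, and the function names (`nag_rate`, `qn_rate`, `aqnpe_rate`, `crossover`) are hypothetical.

```python
import math


def nag_rate(k: int) -> float:
    """First term of the bound, 1 / k^2 (hidden constant assumed to be 1)."""
    return 1.0 / k**2


def qn_rate(k: int, d: int) -> float:
    """Second term, sqrt(d log k) / k^2.5 (hidden constant assumed to be 1)."""
    return math.sqrt(d * math.log(k)) / k**2.5


def aqnpe_rate(k: int, d: int) -> float:
    """The stated bound min{1/k^2, sqrt(d log k)/k^2.5}, up to constants."""
    return min(nag_rate(k), qn_rate(k, d))


def crossover(d: int, k_max: int = 10**7):
    """Smallest k >= 2 at which the quasi-Newton term drops strictly below 1/k^2.

    Algebraically this is the first k with d * log(k) < k.
    Returns None if no such k is found up to k_max.
    """
    for k in range(2, k_max + 1):
        if qn_rate(k, d) < nag_rate(k):
            return k
    return None


if __name__ == "__main__":
    for d in (10, 100, 1000):
        k_star = crossover(d)
        print(f"d = {d}: quasi-Newton term is smaller from k = {k_star}")
```

Under these unit-constant assumptions, the crossover occurs where $d \log k < k$, which is consistent with the $k = \Omega(d \log d)$ regime described above; the true crossover depends on the constants in the analysis.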