Online Learning Guided Curvature Approximation: A Quasi-Newton Method with Global Non-Asymptotic Superlinear Convergence

Quasi-Newton algorithms are among the most popular iterative methods for solving unconstrained minimization problems, largely due to their favorable superlinear convergence property. However, existing results for these algorithms are limited: they provide either (i) a global convergence guarantee with an asymptotic superlinear convergence rate, or (ii) a local non-asymptotic superlinear rate when the initial point and the initial Hessian approximation are chosen properly. Furthermore, these results are not composable: when the iterates of a globally convergent method reach the region of local superlinear convergence, there is no guarantee that the Hessian approximation matrix satisfies the conditions required for a non-asymptotic local superlinear convergence rate. In this paper, we close this gap and present the first globally convergent quasi-Newton method with an explicit non-asymptotic superlinear convergence rate. Unlike classical quasi-Newton methods, we build our algorithm upon the hybrid proximal extragradient method and propose a novel online learning framework for updating the Hessian approximation matrices. Specifically, guided by the convergence analysis, we formulate the Hessian approximation update as an online convex optimization problem in the space of matrices, and relate the bounded regret of the online problem to the superlinear convergence of our method.
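The abstract does not state the loss function or the online update rule, so the following is only a minimal sketch of what "online convex optimization in the space of matrices" can look like, not the paper's algorithm. It assumes a hypothetical secant-residual loss ℓ_t(B) = ||B s_t - y_t||² / (2 ||s_t||²), projected online gradient descent with a fixed step size, and a feasible set of symmetric matrices with eigenvalues in [lo, hi]. All function names and parameters are illustrative.

```python
import numpy as np


def project_spectrum(B, lo, hi):
    """Frobenius-norm projection onto symmetric matrices with eigenvalues in [lo, hi]."""
    B = 0.5 * (B + B.T)                      # symmetrize first
    w, V = np.linalg.eigh(B)                 # eigen-decomposition of a symmetric matrix
    return (V * np.clip(w, lo, hi)) @ V.T    # clip eigenvalues, then rebuild


def secant_loss(B, s, y):
    """Assumed per-round loss: normalized squared secant residual (convex in B)."""
    r = B @ s - y
    return 0.5 * (r @ r) / (s @ s)


def secant_loss_grad(B, s, y):
    """Gradient of secant_loss with respect to B, restricted to symmetric matrices."""
    G = np.outer(B @ s - y, s) / (s @ s)
    return 0.5 * (G + G.T)


def online_step(B, s, y, lo, hi, eta=1.0):
    """One round of projected online gradient descent on the matrix B."""
    return project_spectrum(B - eta * secant_loss_grad(B, s, y), lo, hi)


def demo(seed=0, dim=5, rounds=200, lo=1.0, hi=10.0):
    """Learn the Hessian A of a quadratic from secant pairs (s, y = A s)."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    A = (Q * np.linspace(lo, hi, dim)) @ Q.T  # true Hessian, spectrum in [lo, hi]
    B = hi * np.eye(dim)                      # initial Hessian approximation
    err0 = np.linalg.norm(B - A)
    for _ in range(rounds):
        s = rng.standard_normal(dim)          # stand-in for an iterate difference
        y = A @ s                             # gradient difference on a quadratic
        B = online_step(B, s, y, lo, hi)
    return err0, np.linalg.norm(B - A)
```

Calling `demo()` returns the Frobenius-norm error of the approximation before and after the online rounds. On this toy quadratic the true Hessian incurs zero loss in every round, so with a step size in (0, 2) each projected gradient step cannot increase the distance to it. The step size and the eigenvalue bounds are choices made for this sketch.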


Related content

Quasi-Newton methods


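Quasi-Newton methods are among the most effective methods for solving nonlinear optimization problems. They were proposed in the 1950s by W. C. Davidon, a physicist at Argonne National Laboratory in the United States. At the time, Davidon's algorithm was regarded as one of the most creative inventions in nonlinear optimization. Soon afterwards, R. Fletcher and M. J. D. Powell confirmed that the new algorithm was far faster and more reliable than other methods, and the field of nonlinear optimization advanced rapidly as a result.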
