We present a Newton-type method that converges fast from any initialization and for arbitrary convex objectives with Lipschitz Hessians. We achieve this by merging the ideas of cubic regularization with a certain adaptive Levenberg--Marquardt penalty. In particular, we show that the iterates given by $x^{k+1}=x^k - \bigl(\nabla^2 f(x^k) + \sqrt{H\|\nabla f(x^k)\|} \mathbf{I}\bigr)^{-1}\nabla f(x^k)$, where $H>0$ is a constant, converge globally with a $\mathcal{O}(\frac{1}{k^2})$ rate. Our method is the first variant of Newton's method that has both cheap iterations and provably fast global convergence. Moreover, we prove that locally our method converges superlinearly when the objective is strongly convex. To boost the method's performance, we present a line search procedure that does not need prior knowledge of $H$ and is provably efficient.
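The update rule stated above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`solve`, `regularized_newton`, `toy_grad`, `toy_hess`), the toy softplus objective, the data `DATA`, the ridge weight `MU`, and the choice `H = 1.0` are all hypothetical. The constant `H` is fixed by hand here, and the line search mentioned in the abstract is not implemented.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def regularized_newton(grad, hess, x0, H, tol=1e-10, max_iter=100):
    """Iterate x <- x - (hess(x) + sqrt(H * ||grad(x)||) * I)^{-1} grad(x).

    H is treated as a user-supplied constant (an assumption of this sketch).
    Returns the final point and the number of iterations performed.
    """
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm <= tol:
            return x, k
        lam = math.sqrt(H * g_norm)        # adaptive Levenberg-Marquardt penalty
        A = [row[:] for row in hess(x)]    # copy so the caller's matrix is untouched
        for i in range(len(x)):
            A[i][i] += lam                 # hess(x) + lam * I
        step = solve(A, g)
        x = [xi - si for xi, si in zip(x, step)]
    return x, max_iter


# Hypothetical toy objective: f(x) = sum_j log(1 + exp(a_j . x)) + (MU / 2) * ||x||^2
DATA = [[1.0, 2.0], [-1.5, 0.5], [0.5, -1.0]]
MU = 0.1


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def toy_grad(x):
    g = [MU * xi for xi in x]
    for a in DATA:
        s = sigmoid(sum(ai * xi for ai, xi in zip(a, x)))
        for i, ai in enumerate(a):
            g[i] += s * ai
    return g


def toy_hess(x):
    n = len(x)
    Hm = [[MU if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a in DATA:
        s = sigmoid(sum(ai * xi for ai, xi in zip(a, x)))
        w = s * (1.0 - s)
        for i in range(n):
            for j in range(n):
                Hm[i][j] += w * a[i] * a[j]
    return Hm


if __name__ == "__main__":
    x_final, iters = regularized_newton(toy_grad, toy_hess, [5.0, -5.0], H=1.0)
    print("iterations:", iters, "point:", x_final)
```

Each iteration costs one gradient, one Hessian, and one linear solve, the same as a plain Newton step. The only change is the diagonal shift `lam`, which shrinks to zero as the gradient vanishes, so the step approaches a pure Newton step near the solution.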