Newton's method can converge more slowly than vanilla Gradient Descent in its initial phase on strongly convex problems. Classical Newton-type multilevel methods mitigate this but, like Gradient Descent, achieve only linear convergence near the minimizer. We introduce an adaptive multilevel Newton-type method with a principled automatic switch to the full Newton method once the region of quadratic convergence is reached. We establish local quadratic convergence for strongly convex functions with Lipschitz continuous Hessians and for self-concordant functions, and confirm it empirically. Although the per-iteration cost can exceed that of classical multilevel schemes, the method is efficient and, in our experiments, consistently outperforms Newton's method, Gradient Descent, and the multilevel Newton method, indicating that second-order methods can outperform first-order methods even when Newton's method is initially slow.
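To make the switching idea concrete, here is a minimal, hypothetical Python sketch, not the paper's actual algorithm: classical self-concordance theory guarantees quadratic convergence of Newton's method once the Newton decrement lambda(x) = (g(x)^T H(x)^{-1} g(x))^{1/2} falls below a constant threshold, so monitoring lambda(x) gives one plausible automatic switch from cheap steps to full Newton. The test objective, the threshold 0.25, and the use of plain gradient steps in place of coarse multilevel iterations are all illustrative assumptions.

```python
import numpy as np

MU = 0.1      # strong-convexity (ridge) parameter of the toy objective
L = 1.0 + MU  # global gradient Lipschitz constant of this particular f

def grad(x):
    # Gradient of f(x) = sum_i log cosh(x_i) + (MU/2)||x||^2 (toy objective)
    return np.tanh(x) + MU * x

def hess(x):
    # Hessian of the toy objective: diagonal, uniformly positive definite
    return np.diag(1.0 - np.tanh(x) ** 2 + MU)

def adaptive_newton(x, switch_thresh=0.25, tol=1e-10, max_iter=200):
    """Cheap first-order steps, switching to full Newton once the Newton
    decrement lambda(x) signals the quadratic phase (hypothetical criterion)."""
    for k in range(max_iter):
        g = grad(x)
        newton_step = np.linalg.solve(hess(x), g)
        lam2 = g @ newton_step         # lambda(x)^2 = g^T H^{-1} g
        if lam2 <= tol:
            break
        if np.sqrt(lam2) < switch_thresh:
            x = x - newton_step        # full Newton: quadratic phase
        else:
            x = x - g / L              # cheap, globally safe gradient step
    return x, k

x, iters = adaptive_newton(np.full(5, 3.0))
print(f"done after {iters} iterations, |grad| = {np.linalg.norm(grad(x)):.2e}")
```

In this toy form the decrement itself already costs a full Hessian solve per iteration; a genuine multilevel scheme would presumably estimate such a switching quantity on the coarse level, which is where its per-iteration savings over full Newton would come from.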