In the realm of gradient-based optimization, Nesterov's accelerated gradient method (NAG) is a landmark advancement, achieving an accelerated convergence rate that outperforms the vanilla gradient descent method for convex functions. However, for strongly convex functions, whether NAG converges linearly remains an open question, as noted in the comprehensive review by Chambolle and Pock [2016]. This question, apart from the case of the critical step size, was resolved by Li et al. [2024a] using a high-resolution differential equation framework. Furthermore, Beck [2017, Section 10.7.4] introduced a monotonically convergent variant of NAG, referred to as M-NAG. Despite these developments, the Lyapunov analysis presented in Li et al. [2024a] cannot be directly extended to M-NAG. In this paper, we propose a modification to the iterative relation by introducing a gradient term, leading to a new gradient-based iterative relation. This adjustment allows for the construction of a novel Lyapunov function that excludes kinetic energy. The linear convergence derived from this Lyapunov function is independent of both the parameters of the strongly convex function and the step size, yielding a more general and robust result. Notably, we observe that the gradient-based iterative relation derived from M-NAG is equivalent to the one derived from NAG once the position-velocity relation is applied. However, the Lyapunov analysis does not rely on the position-velocity relation, which allows us to extend the linear convergence to M-NAG. Finally, by utilizing two proximal inequalities, which serve as the proximal counterparts of strongly convex inequalities, we extend the linear convergence to both the fast iterative shrinkage-thresholding algorithm (FISTA) and its monotonic counterpart (M-FISTA).
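For context, a minimal sketch of the NAG iteration discussed above, written in its standard textbook form for an $L$-smooth, $\mu$-strongly convex objective $f$ with step size $s \le 1/L$; this parameterization is an assumption for illustration and need not coincide with the exact scheme analyzed in the paper:
\[
x_{k+1} = y_k - s \nabla f(y_k), \qquad y_{k+1} = x_{k+1} + \frac{1 - \sqrt{\mu s}}{1 + \sqrt{\mu s}}\,\bigl(x_{k+1} - x_k\bigr),
\]
where $y_k$ is the extrapolated (momentum) point and the second update supplies the position-velocity coupling referred to in the abstract. M-NAG augments this scheme with a monotonicity safeguard so that the objective value never increases along the iterates.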
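Similarly, for a composite objective $F = f + g$ with $f$ smooth and $g$ convex (possibly nonsmooth), FISTA replaces the gradient step by a proximal step. The sketch below uses the standard Beck-Teboulle parameterization, again as an illustrative assumption rather than the paper's exact scheme:
\[
x_{k+1} = \operatorname{prox}_{s g}\!\bigl(y_k - s \nabla f(y_k)\bigr), \qquad t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}, \qquad y_{k+1} = x_{k+1} + \frac{t_k - 1}{t_{k+1}}\,\bigl(x_{k+1} - x_k\bigr),
\]
where $\operatorname{prox}_{s g}(z) = \arg\min_x \bigl\{ g(x) + \tfrac{1}{2s}\|x - z\|^2 \bigr\}$. The monotone variant M-FISTA enforces $F(x_{k+1}) \le F(x_k)$ by falling back to the previous iterate whenever the proximal step would increase the objective value.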