The development of Nesterov's accelerated gradient descent (NAG) method was a significant milestone in modern gradient-based optimization. Its proximal generalization, the fast iterative shrinkage-thresholding algorithm (FISTA), extends this forward-backward technique and is widely used in image science and engineering. Nonetheless, it remains unclear whether NAG and FISTA converge linearly for strongly convex functions. Remarkably, these algorithms converge without requiring any prior knowledge of the strong convexity modulus, and this intriguing characteristic has been acknowledged as an open problem in the comprehensive review [Chambolle and Pock, 2016, Appendix B]. In this paper, we address this question using the high-resolution ordinary differential equation (ODE) framework. Building on the established phase-space representation, we emphasize the distinctive construction of the Lyapunov function, in which the coefficient of the kinetic energy adapts dynamically over the iterations. Furthermore, we highlight that the linear convergence of both NAG and FISTA is independent of the parameter $r$. Additionally, we show that the square of the proximal subgradient norm also converges linearly.
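To make the role of the parameter $r$ concrete, the following is a minimal sketch of NAG, not the paper's own code. It assumes the common parametrization in which the momentum coefficient is $k/(k+r+1)$ and the step size is $1/L$, and it runs on a hypothetical diagonal quadratic test function. The names `nag`, `quad_grad` and `quad_value` are illustrative only. Note that the strong convexity modulus never enters the update.

```python
def quad_grad(diag):
    """Gradient of f(x) = 0.5 * sum(d_i * x_i^2), a strongly convex test function."""
    return lambda x: [d * xi for d, xi in zip(diag, x)]


def quad_value(diag, x):
    """Objective value of the diagonal quadratic; its minimum is 0 at x = 0."""
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))


def nag(grad, x0, step, r=2, iters=200):
    """Run NAG and return the list of iterates x_1, ..., x_iters.

    Assumed update (an illustration, not taken verbatim from the paper):
        x_{k+1} = y_k - step * grad(y_k)
        y_{k+1} = x_{k+1} + k / (k + r + 1) * (x_{k+1} - x_k)
    Only the step size is needed; no strong convexity modulus is used.
    """
    x_prev = list(x0)
    y = list(x0)
    iterates = []
    for k in range(iters):
        g = grad(y)
        x = [yi - step * gi for yi, gi in zip(y, g)]   # gradient step from y_k
        beta = k / (k + r + 1)                         # momentum coefficient
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
        iterates.append(x)
    return iterates


# Hypothetical test problem: smallest eigenvalue 1, largest eigenvalue L = 100.
diag = [1.0, 10.0, 100.0]
x0 = [1.0, 1.0, 1.0]
for r in (2, 4):
    xs = nag(quad_grad(diag), x0, step=1.0 / 100.0, r=r, iters=200)
    print(r, quad_value(diag, xs[-1]))
```

Printing the objective gap for several values of `r` gives a quick numerical feel for the claim that the convergence behaviour does not hinge on this parameter. It is an observation on a single toy problem, not evidence for the theorem.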