We revisit the general framework introduced by Fazlyab et al. (SIAM J. Optim. 28, 2018) to construct Lyapunov functions for optimization algorithms in discrete and continuous time. For smooth, strongly convex objective functions, we relax the requirements necessary for such a construction. As a result, for Polyak's ordinary differential equations and for a two-parameter family of Nesterov algorithms, we prove rates of convergence that improve on those available in the literature. We analyse the interpretation of Nesterov algorithms as discretizations of the Polyak equation. We show that the algorithms are instances of Additive Runge-Kutta integrators and discuss why most discretizations of the differential equation do not result in optimization algorithms with acceleration. We also introduce a modification of Polyak's equation and study its convergence properties. Finally, we extend the general framework to the stochastic scenario and consider an application to random algorithms with acceleration for overparameterized models; here too we prove convergence rates that improve on those in the literature.
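To make the "two-parameter family of Nesterov algorithms" concrete, the following is a minimal sketch, assuming the common parameterization with a step size `alpha` and a momentum coefficient `beta`: y_k = x_k + beta (x_k - x_{k-1}), x_{k+1} = y_k - alpha ∇f(y_k). The abstract does not state the parameterization, so this form, the function name `nesterov`, and the diagonal quadratic test problem are all illustrative assumptions rather than the authors' setup. The parameter values alpha = 1/L and beta = (√κ - 1)/(√κ + 1), with κ = L/m, are the textbook choice for an m-strongly convex, L-smooth objective.

```python
import math

def nesterov(grad, x0, alpha, beta, steps):
    """Iterate y_k = x_k + beta*(x_k - x_{k-1}), x_{k+1} = y_k - alpha*grad(y_k).

    The (alpha, beta) parameterization is an assumption made for illustration.
    """
    x_prev = list(x0)
    x = list(x0)
    for _ in range(steps):
        # Extrapolation (momentum) step
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        # Gradient step taken from the extrapolated point
        x_prev, x = x, [yi - alpha * gi for yi, gi in zip(y, g)]
    return x

# Hypothetical test problem: f(x) = 0.5 * sum(d_i * x_i^2), so m = 1 and L = 100.
diag = [1.0, 10.0, 100.0]
m, L = min(diag), max(diag)

def f(x):
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))

def grad_f(x):
    return [d * xi for d, xi in zip(diag, x)]

kappa = L / m
alpha = 1.0 / L
beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)

x0 = [1.0, 1.0, 1.0]
x_acc = nesterov(grad_f, x0, alpha, beta, 200)  # accelerated iteration
x_gd = nesterov(grad_f, x0, alpha, 0.0, 200)    # beta = 0 reduces to gradient descent

print("accelerated:", f(x_acc))
print("gradient descent:", f(x_gd))
```

On this ill-conditioned quadratic the accelerated iteration reaches a far smaller objective value than plain gradient descent in the same number of steps, which is the kind of acceleration the abstract refers to. The sketch says nothing about the sharper rates proved in the paper.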