We revisit the general framework introduced by Fazlyab et al. (SIAM J. Optim. 28, 2018) to construct Lyapunov functions for optimization algorithms in discrete and continuous time. For smooth, strongly convex objective functions, we relax the requirements necessary for such a construction. As a result, we are able to prove, for Polyak's ordinary differential equation and for a two-parameter family of Nesterov algorithms, rates of convergence that improve on those available in the literature. We analyse the interpretation of Nesterov algorithms as discretizations of the Polyak equation. We show that the algorithms are instances of additive Runge-Kutta integrators and discuss why most discretizations of the differential equation do not result in optimization algorithms with acceleration. We also introduce a modification of Polyak's equation and study its convergence properties. Finally, we extend the general framework to the stochastic scenario and consider an application to random algorithms with acceleration for overparameterized models; again, we are able to prove convergence rates that improve on those in the literature.
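As a rough illustration of what "acceleration" means here, the sketch below compares plain gradient descent with a Nesterov-type iteration on a small strongly convex quadratic. Everything in it is an assumption made for illustration and is not taken from the paper: the test function, the constants m = 1 and L = 100, and the textbook parameter choice alpha = 1/L, beta = (sqrt(L) - sqrt(m)) / (sqrt(L) + sqrt(m)), which is only one point in a two-parameter (alpha, beta) family.

```python
import math

def f(x, m, L):
    # Illustrative objective (an assumption): f(x) = 0.5 * (m*x0^2 + L*x1^2),
    # which is m-strongly convex with L-Lipschitz gradient.
    return 0.5 * (m * x[0] ** 2 + L * x[1] ** 2)

def grad(x, m, L):
    # Gradient of the quadratic above.
    return [m * x[0], L * x[1]]

def gradient_descent(x0, m, L, steps):
    # Plain gradient descent with step size 1/L.
    alpha = 1.0 / L
    x = list(x0)
    for _ in range(steps):
        g = grad(x, m, L)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x

def nesterov(x0, m, L, steps):
    # Nesterov-type iteration with two parameters (alpha, beta):
    #   y_k     = x_k + beta * (x_k - x_{k-1})
    #   x_{k+1} = y_k - alpha * grad f(y_k)
    # The values below are the common textbook choice, assumed for this sketch.
    alpha = 1.0 / L
    beta = (math.sqrt(L) - math.sqrt(m)) / (math.sqrt(L) + math.sqrt(m))
    x_prev = list(x0)
    x = list(x0)
    for _ in range(steps):
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y, m, L)
        x_prev, x = x, [yi - alpha * gi for yi, gi in zip(y, g)]
    return x

M, L_SMOOTH = 1.0, 100.0   # hypothetical strong-convexity and smoothness constants
X0 = [1.0, 1.0]            # hypothetical starting point
f_gd = f(gradient_descent(X0, M, L_SMOOTH, 100), M, L_SMOOTH)
f_nag = f(nesterov(X0, M, L_SMOOTH, 100), M, L_SMOOTH)
print("gradient descent:", f_gd)
print("Nesterov-type:   ", f_nag)
```

With a condition number of L/m = 100, the momentum iteration should reach a much smaller objective value in the same number of gradient evaluations. This is an empirical comparison on one problem, not a verification of any rate proved in the paper.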