The convergence rates of convex and non-convex optimization methods depend on the choice of a host of constants, including step sizes, Lyapunov function constants, and momentum constants. In this work, we propose factorial powers as a flexible tool for defining the constants that appear in convergence proofs. We list a number of remarkable properties that these sequences enjoy, and show how they can be applied to simplify convergence proofs or improve the convergence rates of the momentum method, the accelerated gradient method, and the stochastic variance reduced gradient method (SVRG).
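As a rough illustration of the kind of sequence the abstract refers to, the sketch below assumes the standard integer-order definitions of rising and falling factorial powers, k(k+1)...(k+r-1) and k(k-1)...(k-r+1); the paper's own definition may be more general (for example, allowing non-integer orders). The function names `rising_power` and `falling_power` are hypothetical and chosen only for this example. It also checks two well-known identities that make such sequences convenient as weights in telescoping arguments: a simple difference rule and a closed-form sum.

```python
def rising_power(k, r):
    """Rising factorial power k (k+1) ... (k+r-1) for an integer order r >= 0.

    Assumption: standard integer-order definition; the empty product (r = 0) is 1.
    """
    result = 1
    for j in range(r):
        result *= k + j
    return result


def falling_power(k, r):
    """Falling factorial power k (k-1) ... (k-r+1) for an integer order r >= 0."""
    result = 1
    for j in range(r):
        result *= k - j
    return result


def check_difference_rule(k, r):
    """Check (k+1)^(rising r) - k^(rising r) == r * (k+1)^(rising r-1), for r >= 1."""
    lhs = rising_power(k + 1, r) - rising_power(k, r)
    rhs = r * rising_power(k + 1, r - 1)
    return lhs == rhs


def check_summation_rule(n, r):
    """Check sum_{i=1}^{n} i^(rising r) == n^(rising r+1) / (r + 1).

    Both sides are multiplied by (r + 1) so the comparison stays in integers.
    """
    total = sum(rising_power(i, r) for i in range(1, n + 1))
    return (r + 1) * total == rising_power(n, r + 1)


if __name__ == "__main__":
    # Spot-check both identities over a small grid of arguments.
    for r in range(1, 5):
        for k in range(1, 20):
            assert check_difference_rule(k, r)
            assert check_summation_rule(k, r)
    print("identities hold on the tested grid")
```

The summation rule is the discrete analogue of integrating a monomial, which is one reason weights of this form can yield closed-form sums where ordinary powers of k would not.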