We present a generalization of Nesterov's accelerated gradient descent algorithm. Our algorithm (AGNES) provably achieves acceleration for smooth convex and strongly convex minimization tasks with noisy gradient estimates, provided the noise intensity is proportional to the magnitude of the gradient at every point. Nesterov's method converges at an accelerated rate only if this constant of proportionality is below 1, whereas AGNES accommodates any signal-to-noise ratio. The noise model is motivated by applications in overparametrized machine learning. AGNES requires only two parameters for convex and three for strongly convex minimization tasks, improving on existing methods. We further provide clear geometric interpretations of, and heuristics for, the choice of parameters.
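One way to make the noise model precise (a sketch in our own notation, not fixed by the abstract itself: $f$ is the objective, $g(x)$ a stochastic estimate of $\nabla f(x)$, and $\sigma \ge 0$ the constant of proportionality referred to above) is the standard relative-noise condition
\[
\mathbb{E}\bigl[g(x)\bigr] = \nabla f(x), \qquad \mathbb{E}\bigl[\|g(x) - \nabla f(x)\|^2\bigr] \le \sigma^2 \,\|\nabla f(x)\|^2 \quad \text{for all } x.
\]
Under this reading, the claim above is that Nesterov's method accelerates when $\sigma < 1$, while AGNES does so for every $\sigma \ge 0$.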