Stochastic gradient descent with momentum is a popular variant of stochastic gradient descent that has recently been reported to be closely related to the underdamped Langevin diffusion. In this paper, we establish a quantitative error estimate between the two in the 1-Wasserstein and total variation distances.