This paper considers the problem of asynchronous stochastic nonconvex optimization with heavy-tailed gradient noise and arbitrarily heterogeneous computation times across workers. We propose an asynchronous normalized stochastic gradient descent algorithm with momentum. Our analysis shows that the method achieves the optimal time complexity under the assumption of a bounded $p$th-order central moment with $p\in(1,2]$. We also provide numerical experiments demonstrating the effectiveness of the proposed method.
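For context, a single-worker normalized SGD-with-momentum update takes the following standard form (a sketch of the generic template, not necessarily the exact asynchronous update rule analyzed in this paper), where $g_t$ is a stochastic gradient at iterate $x_t$, $\beta \in (0,1]$ is a momentum parameter, and $\eta > 0$ is a step size:
\[
m_t = (1-\beta)\, m_{t-1} + \beta\, g_t,
\qquad
x_{t+1} = x_t - \eta\, \frac{m_t}{\|m_t\|}.
\]
Normalizing the momentum buffer keeps the step length bounded regardless of the gradient magnitude, which is the standard mechanism for coping with heavy-tailed noise that has only a bounded $p$th central moment, $p\in(1,2]$.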