Sign stochastic gradient descent (signSGD) is a communication-efficient method that transmits only the sign of each stochastic gradient coordinate for parameter updating. Existing literature has shown that signSGD achieves a convergence rate of $\mathcal{O}(d^{1/2}T^{-1/4})$, where $d$ is the dimension and $T$ is the number of iterations. In this paper, we improve this rate to $\mathcal{O}(d^{1/2}T^{-1/3})$ by introducing the Sign-based Stochastic Variance Reduction (SSVR) method, which employs variance-reduction estimators to track gradients and uses their signs for the updates. For finite-sum problems, our method can be further enhanced to achieve a convergence rate of $\mathcal{O}(m^{1/4}d^{1/2}T^{-1/2})$, where $m$ denotes the number of component functions. Furthermore, we investigate the heterogeneous majority vote in distributed settings and introduce two novel algorithms that attain improved convergence rates of $\mathcal{O}(d^{1/2}T^{-1/2} + dn^{-1/2})$ and $\mathcal{O}(d^{1/4}T^{-1/4})$, respectively, outperforming the previous results of $\mathcal{O}(dT^{-1/4} + dn^{-1/2})$ and $\mathcal{O}(d^{3/8}T^{-1/8})$, where $n$ is the number of nodes. Numerical experiments across different tasks validate the effectiveness of the proposed methods.
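To make the core idea concrete, the following is a minimal sketch of a sign-based update driven by a variance-reduced gradient estimator, run on a toy quadratic objective. The recursive (STORM-style) estimator form, the objective, and all hyperparameters here are illustrative assumptions, not the paper's exact SSVR algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
target = rng.normal(size=d)  # minimizer of f(x) = 0.5 * ||x - target||^2

def stoch_grad(x, noise):
    """Stochastic gradient of the toy objective; `noise` models sampling noise."""
    return (x - target) + 0.1 * noise

def ssvr_sketch(T=3000, eta=0.01, a=0.1):
    # Assumed recursive variance-reduced estimator (STORM-style):
    #   v <- grad(x_new; xi) + (1 - a) * (v - grad(x_old; xi)),
    # with the SAME sample xi evaluating both points, and a sign-only update.
    x = np.zeros(d)
    v = stoch_grad(x, rng.normal(size=d))  # initialize with one fresh gradient
    for _ in range(T):
        x_new = x - eta * np.sign(v)       # transmit/apply only signs (1 bit/coord)
        noise = rng.normal(size=d)
        v = stoch_grad(x_new, noise) + (1 - a) * (v - stoch_grad(x, noise))
        x = x_new
    return x

x_final = ssvr_sketch()
print(f"distance to optimum: {np.linalg.norm(x_final - target):.4f}")
```

Because the estimator $v_t$ has much lower variance than a raw stochastic gradient, its sign agrees with the true gradient's sign more often near the optimum, which is what allows the faster rate despite the one-bit updates.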
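For the distributed setting, the standard majority-vote aggregation referenced above can be sketched as follows: each node sends only the sign of its local stochastic gradient, and the server updates with the coordinate-wise majority sign. The node gradients below are illustrative values, not data from the paper's experiments.

```python
import numpy as np

# Each row is one node's local stochastic gradient (n = 3 nodes, d = 3 coords).
local_grads = np.array([
    [ 1.0, -2.0,  0.3],   # node 1
    [ 0.5,  1.0, -0.8],   # node 2
    [-0.2,  3.0, -0.1],   # node 3
])

# Nodes transmit np.sign(local_grads): one bit per coordinate per node.
# The server takes the sign of the summed votes (coordinate-wise majority).
vote = np.sign(np.sign(local_grads).sum(axis=0))
print(vote)  # -> [ 1.  1. -1.]

# Server-side update with some step size eta (assumed form):
eta = 0.01
x = np.zeros(3)
x -= eta * vote
```

Per round, each node communicates $d$ bits instead of $d$ floats, which is the source of the communication savings; the heterogeneous case the abstract addresses is the one where nodes' local data distributions differ.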