We show that the norm test and the inner product/orthogonality test presented in \cite{Bol18} are equivalent in terms of the convergence rates of Stochastic Gradient Descent (SGD) methods when $\epsilon^2=\theta^2+\nu^2$, for specific choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the norm of the gradient, while $\theta$ and $\nu$ control the relative statistical error of the gradient in the direction of the gradient and in the direction orthogonal to the gradient, respectively. Furthermore, we show that under the constraint $\epsilon^2=\theta^2+\nu^2$ the inner product/orthogonality test is never cheaper computationally than the norm test, and that it matches the cost of the norm test only in the best case, when $\theta$ and $\nu$ are optimally selected. Finally, we present two stochastic optimization problems to illustrate our results.
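
For orientation, the relation $\epsilon^2=\theta^2+\nu^2$ can be read as a Pythagorean split of the sampling error. The following is only a sketch in notation assumed here rather than taken from the text: $g$ denotes the sampled gradient estimate, $\nabla F$ the true gradient, and the tests are written in an exact-expectation form that may differ from the sample-based form used in \cite{Bol18}. Decomposing the error $g-\nabla F$ into its components along and orthogonal to $\nabla F$ gives
\[
\mathrm{E}\,\|g-\nabla F\|^2
= \frac{\mathrm{E}\big[(g^{\top}\nabla F-\|\nabla F\|^2)^2\big]}{\|\nabla F\|^2}
+ \mathrm{E}\,\Big\|g-\frac{g^{\top}\nabla F}{\|\nabla F\|^2}\,\nabla F\Big\|^2 .
\]
Under these assumptions, if the first term is bounded by $\theta^2\|\nabla F\|^2$ (an inner-product-type condition) and the second by $\nu^2\|\nabla F\|^2$ (an orthogonality-type condition), then
\[
\mathrm{E}\,\|g-\nabla F\|^2 \le (\theta^2+\nu^2)\,\|\nabla F\|^2 = \epsilon^2\,\|\nabla F\|^2 ,
\]
which is a norm-type condition with $\epsilon^2=\theta^2+\nu^2$.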