We investigate stochastic gradient methods and stochastic counterparts of the Barzilai-Borwein steplengths and their application to finite-sum minimization problems. Our proposal is based on the Trust-Region-ish (TRish) framework introduced in [F. E. Curtis, K. Scheinberg, R. Shi, A stochastic trust region algorithm based on careful step normalization, INFORMS Journal on Optimization, 1, 2019]. The new framework, named TRishBB, aims to enhance the performance of TRish and to reduce the computational cost of the second-order TRish variant. We propose three different methods belonging to the TRishBB framework and present a convergence analysis for possibly nonconvex objective functions, considering both biased and unbiased gradient approximations. Our analysis requires neither diminishing steplengths nor full gradient evaluations. Numerical experiments on machine learning applications demonstrate the effectiveness of applying the Barzilai-Borwein steplength with stochastic gradients and show improved testing accuracy compared with the TRish method.