We study stochastic Cubic Newton methods for solving general, possibly non-convex minimization problems. We propose a new framework, which we call the helper framework, that provides a unified view of stochastic and variance-reduced second-order algorithms equipped with global complexity guarantees. It can also be applied to learning with auxiliary information. The helper framework gives the algorithm designer considerable flexibility in constructing and analyzing stochastic Cubic Newton methods: it allows batches of arbitrary size and noisy, possibly biased estimates of the gradients and Hessians, and it incorporates both variance reduction and lazy Hessian updates. We recover the best-known complexities for stochastic and variance-reduced Cubic Newton under weak assumptions on the noise. A direct consequence of our theory is a new lazy stochastic second-order method, which significantly improves the arithmetic complexity for high-dimensional problems. We also establish complexity bounds for classes of gradient-dominated objectives, which include convex and strongly convex problems. For auxiliary learning, we show that using a helper (an auxiliary function) can outperform training alone if a given similarity measure is small.
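To make the two central ingredients of the abstract concrete, the following is a minimal sketch of a cubic-regularized Newton step combined with lazy Hessian updates. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the names `cubic_step`, `lazy_cubic_newton`, `grad_est`, `hess_est`, and `hess_every` are hypothetical, the regularization constant `M` is assumed to be supplied by the user, the degenerate "hard case" of the cubic subproblem is ignored, and variance reduction and helper functions are omitted.

```python
import numpy as np


def cubic_step(g, H, M):
    """Approximately minimize g@h + 0.5*h@H@h + (M/6)*||h||^3 over h.

    Assumes M > 0 and H symmetric. The minimizer has the form
    h = -(H + (M*r/2) I)^{-1} g with r = ||h||, so we search for r by
    bisection. The degenerate "hard case" is not handled.
    """
    lam, Q = np.linalg.eigh(H)          # H = Q diag(lam) Q^T, lam ascending
    gt = Q.T @ g                        # gradient in the eigenbasis

    def phi(r):
        # phi(r) = ||(H + (M*r/2) I)^{-1} g|| - r, decreasing in r
        return np.linalg.norm(gt / (lam + 0.5 * M * r)) - r

    # r must keep H + (M*r/2) I positive definite
    lo = max(0.0, -2.0 * lam[0] / M) + 1e-12
    hi = max(lo, 1.0)
    while phi(hi) > 0:                  # grow the bracket until phi changes sign
        hi *= 2.0
    for _ in range(100):                # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:
            lo = mid
        else:
            hi = mid
    return -Q @ (gt / (lam + 0.5 * M * hi))


def lazy_cubic_newton(grad_est, hess_est, x0, M, n_iters, hess_every):
    """Cubic Newton iterations that refresh the Hessian estimate lazily.

    grad_est(x) and hess_est(x) are user-supplied (possibly noisy)
    estimators; hess_est is called only once every `hess_every` steps.
    """
    x = np.asarray(x0, dtype=float).copy()
    H = None
    for t in range(n_iters):
        if t % hess_every == 0:
            H = hess_est(x)             # expensive call, reused in between
        g = grad_est(x)                 # fresh gradient estimate every step
        x = x + cubic_step(g, H, M)
    return x
```

In a stochastic setting, `grad_est` and `hess_est` would typically average over a sampled mini-batch; with `hess_every > 1`, each Hessian estimate (and, in a more careful implementation, its factorization) is reused across several steps, which is the source of the arithmetic savings the abstract attributes to lazy updates.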