Randomized-subspace methods reduce the cost of first-order optimization by using only low-dimensional projected-gradient information, which makes them attractive for forward-mode automatic differentiation and communication-limited settings. Nesterov acceleration is well understood for full-gradient and coordinate-based methods. For general subspace sketches, however, it is technically nontrivial to obtain accelerated methods that use only projected-gradient information and can improve over full-dimensional Nesterov acceleration in oracle complexity. We develop randomized-subspace Nesterov accelerated gradient methods for smooth convex and smooth strongly convex optimization under matrix smoothness and generic sketch moment assumptions. The key technical ingredient is a three-sequence formulation tailored to matrix smoothness, which recovers the corresponding classical Nesterov methods in the full-dimensional case. The resulting theory establishes accelerated oracle-complexity guarantees and makes explicit how matrix smoothness and the sketch distribution enter the complexity. It also provides a unified basis for comparing sketch families and for identifying when randomized-subspace acceleration improves over full-dimensional Nesterov acceleration in oracle complexity.
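To make the phrase "projected-gradient information" concrete, the following is a minimal sketch of a plain, non-accelerated randomized-subspace gradient step. It is not the three-sequence accelerated method developed in this work. The Gaussian sketch scaled so that E[S Sᵀ] = I, the toy diagonal quadratic, the hand-picked step size, and the names `subspace_gradient_step` and `run_demo` are all assumptions made for illustration only.

```python
import numpy as np


def subspace_gradient_step(grad_fn, x, r, step, rng):
    """One plain (non-accelerated) step: x - step * S S^T grad f(x)."""
    d = x.shape[0]
    # Assumed sketch family: Gaussian S in R^{d x r}, scaled so E[S S^T] = I.
    S = rng.standard_normal((d, r)) / np.sqrt(r)
    # Projected-gradient information: the r numbers S^T grad f(x).
    # For simplicity this forms the full gradient first; with forward-mode AD
    # the same r numbers would come from r directional derivatives.
    projected = S.T @ grad_fn(x)
    return x - step * (S @ projected)


def run_demo(d=20, r=5, steps=1000, seed=0):
    """Minimize the toy quadratic f(x) = 0.5 * x^T A x; return (f(x_0), f(x_final))."""
    rng = np.random.default_rng(seed)
    A = np.diag(np.linspace(1.0, 10.0, d))  # hypothetical test problem
    f = lambda x: 0.5 * x @ A @ x
    grad_f = lambda x: A @ x
    x = np.ones(d)
    f0 = f(x)
    step = 1.0 / 34.0  # conservative hand-picked step size for this toy problem
    for _ in range(steps):
        x = subspace_gradient_step(grad_f, x, r, step, rng)
    return f0, f(x)


if __name__ == "__main__":
    f0, f_final = run_demo()
    print(f"f(x_0) = {f0:.3e}, f(x_final) = {f_final:.3e}")
```

Each iteration touches only an `r`-dimensional projection of the gradient, which is the oracle model the abstract refers to. An accelerated variant would add momentum sequences on top of this step.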