We derive high-dimensional scaling limits and fluctuations for the online least-squares Stochastic Gradient Descent (SGD) algorithm by explicitly taking the properties of the data-generating model into account. Our approach treats the SGD iterates as an interacting particle system, where the expected interaction is characterized by the covariance structure of the input. Assuming smoothness conditions on moments of order up to eight, and without explicitly assuming Gaussianity, we establish the high-dimensional scaling limits and fluctuations in the form of infinite-dimensional Ordinary Differential Equations (ODEs) or Stochastic Differential Equations (SDEs). Our results reveal a precise three-step phase transition of the iterates: their behavior goes from ballistic, to diffusive, and finally to purely random as the noise variance goes from low, to moderate, and finally to very high. In the low-noise setting, we further characterize the precise fluctuations of the (scaled) iterates as infinite-dimensional SDEs. We also show the existence and uniqueness of solutions to the derived limiting ODEs and SDEs. Our results have several applications, including characterization of the limiting mean-square estimation or prediction errors and their fluctuations, which can be obtained by solving the limiting equations analytically or numerically.
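For readers who want a concrete picture of the algorithm under study, the following is a minimal simulation sketch of online least-squares SGD, in which each observation is used once and the mean-square estimation error is recorded along the trajectory. It is not the construction analyzed in the paper: the function name `online_lsq_sgd`, the Gaussian inputs, the diagonal input covariance, and the `step / d` step-size scaling are all assumptions chosen only for illustration.

```python
import numpy as np


def online_lsq_sgd(d=200, n_steps=2000, step=0.5, noise_std=0.1, seed=0):
    """Run online least-squares SGD and return the squared estimation error.

    Illustrative assumptions (not taken from the paper): Gaussian inputs,
    a diagonal input covariance, and a step size scaled as step / d.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical ground-truth parameter with squared norm of order one.
    theta_star = rng.standard_normal(d) / np.sqrt(d)

    # Assumed input covariance: diagonal, eigenvalues spread over [0.5, 1.5].
    scales = np.sqrt(np.linspace(0.5, 1.5, d))

    theta = np.zeros(d)
    errors = np.empty(n_steps + 1)
    errors[0] = np.sum((theta - theta_star) ** 2)

    for t in range(n_steps):
        # Draw one fresh sample (x, y); each sample is used exactly once.
        x = scales * rng.standard_normal(d)
        y = x @ theta_star + noise_std * rng.standard_normal()

        # Stochastic gradient step on the squared loss 0.5 * (y - x @ theta)**2.
        theta = theta + (step / d) * (y - x @ theta) * x

        errors[t + 1] = np.sum((theta - theta_star) ** 2)

    return errors
```

Calling `online_lsq_sgd(noise_std=...)` with different noise levels gives an empirical error trajectory that could be compared against a numerical solution of a limiting equation. The specific scalings of step size and noise variance under which the ballistic, diffusive, and purely random regimes appear are set by the paper's theorems and are not reproduced here.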