We study the asymptotic shape of the trajectory of the stochastic gradient descent (SGD) algorithm applied to a convex objective function. Under mild regularity assumptions, we prove a functional central limit theorem for the suitably rescaled trajectory, characterizing the long-run fluctuations of the algorithm around the minimizer through a diffusion limit. In contrast to classical central limit theorems for the last iterate or for Polyak-Ruppert averages, this functional result captures the temporal structure of the fluctuations and applies to non-smooth settings such as robust location estimation, including the geometric median.
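For concreteness, the display below is a minimal sketch of the setting in which functional limits of this kind are typically formulated; the step-size schedule $\gamma_n = \gamma_0 n^{-\alpha}$, the particular rescaling, and the Ornstein-Uhlenbeck form of the limit are standard in this literature and are stated here as assumptions, not as the precise hypotheses or conclusion of our theorem.

% Illustrative sketch only; symbols and conditions are assumptions, not the
% paper's exact statement. H(theta, Z) is an unbiased estimate of a
% (sub)gradient of f at theta, theta^* is the minimizer of f, and gamma_0,
% alpha parameterize the decaying step sizes.
\[
  \theta_{n+1} \;=\; \theta_n - \gamma_{n+1}\, H(\theta_n, Z_{n+1}),
  \qquad
  \gamma_n = \gamma_0\, n^{-\alpha}, \quad \tfrac12 < \alpha < 1.
\]
% For the geometric median, for instance, one may take
% f(theta) = E||theta - Z|| and H(theta, Z) = (theta - Z)/||theta - Z||.
% A functional central limit theorem for the trajectory then concerns a
% rescaled process such as
\[
  X_n(t) \;=\; \frac{\theta_{\lfloor nt \rfloor} - \theta^*}
                    {\sqrt{\gamma_{\lfloor nt \rfloor}}},
  \qquad t \in (0, T],
\]
% asserting weak convergence, in a suitable path space such as the Skorokhod
% space, to a Gaussian diffusion, e.g. an Ornstein-Uhlenbeck process
% dX_t = -A X_t dt + \Sigma^{1/2} dW_t for suitable matrices A and Sigma.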