We study the convergence properties and escape dynamics of Stochastic Gradient Descent (SGD) in one-dimensional landscapes, treating infinite- and finite-variance noise separately. Our main focus is to identify the time scales on which SGD reliably moves from an initial point to the local minimum in the same "basin". Under suitable conditions on the noise distribution, we prove that SGD converges to the basin's minimum unless the initial point lies too close to a local maximum. When the initial point is that close to a maximum, we show that SGD can linger in its neighborhood for a long time. For initial points near a "sharp" maximum, by contrast, we show that SGD does not remain stuck there, and we provide results for estimating the probability that it reaches each of the two neighboring minima. Overall, our findings present a nuanced view of SGD's transitions between local maxima and minima, shaped by both the noise characteristics and the geometry of the underlying function.
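The contrast between starting well inside a basin and starting near a local maximum can be illustrated numerically. The sketch below is only an illustration under assumptions that are not taken from the paper: a hypothetical double-well objective f(x) = x^4/4 - x^2/2 (minima at x = -1 and x = +1, a smooth local maximum at x = 0), additive Gaussian gradient noise (so only the finite-variance case is covered), and arbitrary values for the step size, noise scale, and number of steps. The function names `grad`, `run_sgd`, and `basin_frequency` are likewise hypothetical.

```python
import random


def grad(x):
    # Gradient of the assumed double-well f(x) = x**4/4 - x**2/2.
    # Minima at x = -1 and x = +1, local maximum at x = 0.
    return x**3 - x


def run_sgd(x0, step=0.01, noise_scale=0.5, n_steps=2000, rng=None):
    # One SGD trajectory with additive Gaussian (finite-variance) gradient noise.
    if rng is None:
        rng = random.Random(0)
    x = x0
    for _ in range(n_steps):
        noisy_grad = grad(x) + noise_scale * rng.gauss(0.0, 1.0)
        x -= step * noisy_grad
    return x


def basin_frequency(x0, n_runs=200, seed=0, **kwargs):
    # Empirical fraction of runs that finish in the right-hand basin (x > 0).
    rng = random.Random(seed)
    right = sum(run_sgd(x0, rng=rng, **kwargs) > 0 for _ in range(n_runs))
    return right / n_runs


if __name__ == "__main__":
    print("start inside right basin:", basin_frequency(0.5, n_runs=50))
    print("start at the maximum:   ", basin_frequency(0.0))
```

Under these assumed settings, a start at x = 0.5 ends near x = +1 in every run, whereas a start exactly at the maximum splits roughly evenly between the two minima because the objective is symmetric. The sketch does not model infinite-variance noise or a sharp (non-smooth) maximum, both of which the abstract treats separately.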