Stochastic optimization methods have made large-scale optimization problems tractable in settings where computing the full gradient is prohibitively expensive. Using the theory of modified equations for numerical integrators, we propose a class of stochastic differential equations that approximate the dynamics of general stochastic optimization methods more closely than the original gradient flow. Analyzing a modified stochastic differential equation can reveal qualitative insights about the associated optimization method. Here, we study mean-square stability of the modified equation in the case of stochastic coordinate descent.
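To make the notion of mean-square stability concrete, the following is a minimal sketch that simulates the discrete stochastic coordinate descent iteration itself, not the modified stochastic differential equation proposed here. It assumes a quadratic objective f(x) = 0.5 x^T A x, a uniformly sampled coordinate at each step, and a fixed step size h. The function name `mean_square_norm`, the matrix `A`, and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np

def mean_square_norm(A, h, steps=200, runs=500, seed=0):
    """Monte Carlo estimate of E[|x_k|^2] after `steps` iterations of
    stochastic coordinate descent on f(x) = 0.5 * x^T A x (assumed objective)."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    x = np.ones((runs, d))          # every run starts at x_0 = (1, ..., 1)
    rows = np.arange(runs)
    for _ in range(steps):
        i = rng.integers(d, size=runs)            # one random coordinate per run
        grad_i = np.einsum("rj,rj->r", A[i], x)   # i-th partial derivative, (A x)_i
        x[rows, i] -= h * grad_i                  # update only the chosen coordinate
    return float(np.mean(np.sum(x**2, axis=1)))

# Illustrative diagonal problem: coordinate j is contracted by |1 - h * A[j, j]|.
A = np.diag([1.0, 4.0])
ms_small = mean_square_norm(A, h=0.1)   # both factors below 1 in magnitude
ms_large = mean_square_norm(A, h=0.6)   # second factor is |1 - 2.4| = 1.4
print(ms_small, ms_large)
```

In this diagonal example the iteration is mean-square stable exactly when h < 2 / max_j A[j, j], so the estimate decays below its initial value of 2 for the small step size and grows for the large one.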