We study a new kind of SDE that arose from research on optimization in machine learning. We call it the power-law dynamic because its stationary distribution cannot have a sub-Gaussian tail and instead obeys a power law. We prove that the power-law dynamic is ergodic with a unique stationary distribution, provided the learning rate is small enough. We investigate its first exit time. In particular, we compare the exit times of the (continuous) power-law dynamic and its discretization. The comparison can help guide machine learning algorithms.