In this work, we study the asymptotic randomness of an algorithmic estimator of the saddle point of a globally convex-concave and locally strongly-convex strongly-concave objective. Specifically, we show that the averaged iterates of a Stochastic Extra-Gradient (SEG) method for a Saddle Point Problem (SPP) converge almost surely to the saddle point and satisfy a Central Limit Theorem (CLT) with optimal covariance, under both martingale-difference noise and state-dependent (decision-dependent) Markov noise. To ensure the stability of the algorithm dynamics under state-dependent Markov noise, we propose a variant of SEG with truncated varying sets. Interestingly, we show that a state-dependent Markovian data sequence can cause Stochastic Gradient Descent Ascent (SGDA) to diverge even if the target objective is strongly-convex strongly-concave. The main novelty of this work is establishing a CLT for SEG for a stochastic SPP, especially under state-dependent Markov noise. This is the first step towards online inference for SPPs, with numerous potential applications including games, robust strategic classification, and reinforcement learning. We illustrate our results through numerical experiments.
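To make the averaged SEG iteration concrete, the following is a minimal sketch under simple illustrative assumptions and is not the setup analyzed in this work. It uses a toy strongly-convex strongly-concave objective f(x, y) = 0.5 x^2 + x y - 0.5 y^2 with saddle point (0, 0), i.i.d. Gaussian gradient noise (a special case of martingale-difference noise), and a polynomially decaying step size. The function names, the objective, the step-size schedule, and all parameter values are hypothetical choices. The sketch covers neither the state-dependent Markov noise case nor the truncated variant.

```python
import random


def operator(x, y):
    # Saddle-point operator F(z) = (df/dx, -df/dy) for the assumed toy
    # objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2 (saddle point at (0, 0)).
    return x + y, y - x


def averaged_seg(n_steps=20000, eta0=0.5, decay=0.75, noise_std=1.0, seed=0):
    """Run SEG with iterate averaging; all defaults are illustrative."""
    rng = random.Random(seed)
    x, y = 1.0, -1.0          # arbitrary starting point
    avg_x, avg_y = 0.0, 0.0   # running average of the iterates
    for k in range(1, n_steps + 1):
        eta = eta0 / k ** decay  # assumed step size eta0 / k^decay
        # Extrapolation step using a noisy operator evaluation at (x, y).
        gx, gy = operator(x, y)
        hx = x - eta * (gx + rng.gauss(0.0, noise_std))
        hy = y - eta * (gy + rng.gauss(0.0, noise_std))
        # Update step using a fresh noisy evaluation at the extrapolated point.
        gx, gy = operator(hx, hy)
        x -= eta * (gx + rng.gauss(0.0, noise_std))
        y -= eta * (gy + rng.gauss(0.0, noise_std))
        # Incremental update of the average of iterates 1..k.
        avg_x += (x - avg_x) / k
        avg_y += (y - avg_y) / k
    return avg_x, avg_y
```

Calling `averaged_seg()` returns the averaged iterate, which for this toy problem should lie close to the saddle point (0, 0). Repeating the run over many seeds and rescaling the averaged iterates by the square root of `n_steps` gives an empirical view of the fluctuations that a CLT describes.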