We establish finite-time last-iterate guarantees for vanilla stochastic gradient descent in co-coercive games under noisy feedback. This is a broad class of games that is more general than strongly monotone games, allows for multiple Nash equilibria, and includes examples such as quadratic games with negative semidefinite interaction matrices and potential games with smooth concave potentials. Prior work in this setting has relied on relative noise models, where the noise vanishes as iterates approach equilibrium, an assumption that is often unrealistic in practice. We work instead under a substantially more general noise model in which the second moment of the noise is allowed to scale affinely with the squared norm of the iterates, an assumption natural in learning with unbounded action spaces. Under this model, we prove a last-iterate bound of order $O(\log(t)/t^{1/3})$, the first such bound for co-coercive games under non-vanishing noise. We additionally establish almost sure convergence of the iterates to the set of Nash equilibria and derive time-average convergence guarantees.
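As an informal illustration of the setting described above, the following minimal sketch runs vanilla stochastic gradient steps on a two-player quadratic game whose interaction matrix is symmetric and negative semidefinite, so the set of Nash equilibria is the whole line $x_1 + x_2 = 0$. Everything specific here is an assumption made for illustration and is not taken from the paper: the matrix, the function names, the constants `sigma2` and `kappa`, the Gaussian noise with second moment exactly `sigma2 + kappa * ||x||^2`, and the step-size schedule proportional to $(t+1)^{-2/3}$.

```python
import math
import random


def payoff_gradient(x):
    """V(x) = A x with A = -[[1, 1], [1, 1]] (symmetric, negative semidefinite)."""
    s = x[0] + x[1]
    return (-s, -s)


def noisy_feedback(x, rng, sigma2=1.0, kappa=0.1):
    """Return V(x) + U with E[U] = 0 and E||U||^2 = sigma2 + kappa * ||x||^2."""
    v = payoff_gradient(x)
    second_moment = sigma2 + kappa * (x[0] ** 2 + x[1] ** 2)
    std = math.sqrt(second_moment / 2.0)  # split evenly over the two coordinates
    return (v[0] + rng.gauss(0.0, std), v[1] + rng.gauss(0.0, std))


def run_sgd(steps=20000, x0=(3.0, 1.0), gamma0=0.2, seed=0):
    """Vanilla SGD on the payoffs; returns the last iterate."""
    rng = random.Random(seed)
    x = x0
    for t in range(steps):
        # Illustrative decaying schedule (an assumption, not the paper's choice).
        gamma = gamma0 / (t + 1) ** (2.0 / 3.0)
        g = noisy_feedback(x, rng)
        # Ascent step, since each player maximizes its own payoff.
        x = (x[0] + gamma * g[0], x[1] + gamma * g[1])
    return x


def equilibrium_residual(x):
    """Distance-like residual to the equilibrium set {x : x1 + x2 = 0}."""
    return abs(x[0] + x[1])


if __name__ == "__main__":
    last_iterate = run_sgd()
    print("last iterate:", last_iterate)
    print("residual |x1 + x2|:", equilibrium_residual(last_iterate))
```

The noise in this sketch never vanishes (`sigma2 > 0`) and grows with the norm of the iterate, which is the regime the abstract contrasts with relative noise models. The sketch only reports the last iterate of a single run and does not attempt to reproduce the stated rate.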