While Online Gradient Descent and other no-regret learning procedures are known to converge efficiently to a coarse correlated equilibrium in games where each agent's utility is concave in their own strategy, this is not the case when utilities are non-concave, a common scenario in machine learning applications where strategies are parameterized by deep neural networks, agents' utilities are computed by neural networks, or both. Non-concave games introduce significant game-theoretic and optimization challenges: (i) Nash equilibria may not exist; (ii) local Nash equilibria, though they exist, are intractable to compute; and (iii) mixed Nash, correlated, and coarse correlated equilibria generally have infinite support and are intractable. To sidestep these challenges, we revisit the classical solution concept of $\Phi$-equilibria introduced by Greenwald and Jafari [2003], which is guaranteed to exist for an arbitrary set of strategy modifications $\Phi$ even in non-concave games [Stoltz and Lugosi, 2007]. However, the tractability of $\Phi$-equilibria in such games remains elusive. In this paper, we initiate the study of tractable $\Phi$-equilibria in non-concave games and examine several natural families of strategy modifications. We show that when $\Phi$ is finite, there exists an efficient uncoupled learning algorithm that converges to the corresponding $\Phi$-equilibria. Additionally, we explore cases where $\Phi$ is infinite but consists of local modifications, showing that Online Gradient Descent can efficiently approximate $\Phi$-equilibria in non-trivial regimes.
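As a rough illustration of the setting described above, the following sketch runs projected Online Gradient Descent for two players in a toy non-concave game and then measures how much player 1 would gain, on average over the play history, from one local strategy modification. Everything here is an assumption made for illustration and is not taken from the paper: the utility $u_1(x, y) = \sin(3x)\cos(2y)$ with $u_2 = -u_1$, the strategy set $[-1, 1]$, the step size, and the names `run_ogd` and `local_deviation_gain`. It is not the paper's algorithm or analysis.

```python
import math


def project(z, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return max(lo, min(hi, z))


def u1(x, y):
    """Hypothetical utility of player 1; non-concave in x. Player 2 gets -u1."""
    return math.sin(3.0 * x) * math.cos(2.0 * y)


def run_ogd(x0, y0, eta=0.05, steps=500):
    """Simultaneous projected gradient ascent by both players.

    Returns the list of joint strategies (x_t, y_t) that were played.
    """
    x, y = x0, y0
    history = []
    for _ in range(steps):
        history.append((x, y))
        # Gradient of u1 with respect to player 1's own strategy x.
        gx = 3.0 * math.cos(3.0 * x) * math.cos(2.0 * y)
        # Gradient of u2 = -u1 with respect to player 2's own strategy y.
        gy = 2.0 * math.sin(3.0 * x) * math.sin(2.0 * y)
        x, y = project(x + eta * gx), project(y + eta * gy)
    return history


def local_deviation_gain(history, delta):
    """Average gain to player 1 from the modification x -> project(x + delta),
    evaluated against the uniform distribution over the play history."""
    n = len(history)
    base = sum(u1(x, y) for x, y in history) / n
    deviated = sum(u1(project(x + delta), y) for x, y in history) / n
    return deviated - base


if __name__ == "__main__":
    hist = run_ogd(0.2, -0.3)
    for d in (-0.05, 0.05):
        print(d, local_deviation_gain(hist, d))
```

In this sketch, the uniform distribution over the history plays the role of the candidate correlated distribution, and a gain close to zero for every small `delta` would suggest approximate stability against that family of local shifts. Checking a handful of shifts for one player is only a sanity check, not a certificate of a $\Phi$-equilibrium.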