Online Gradient Descent and other no-regret learning procedures are known to efficiently converge to coarse correlated equilibrium in games where each agent's utility is concave in their own strategy. This guarantee does not extend to games with non-concave utilities, a situation that is common in machine learning applications where the agents' strategies are parameterized by deep neural networks, the agents' utilities are computed by a neural network, or both. Indeed, non-concave games present a host of game-theoretic and optimization challenges: (i) Nash equilibria may fail to exist; (ii) local Nash equilibria exist but are intractable; and (iii) mixed Nash, correlated, and coarse correlated equilibria generally have infinite support and are intractable. To sidestep these challenges, we propose a new solution concept, termed $(\varepsilon, \Phi(\delta))$-local equilibrium, which generalizes local Nash equilibrium in non-concave games, as well as (coarse) correlated equilibrium in concave games. Importantly, we show that two instantiations of this solution concept capture the convergence guarantees of Online Gradient Descent and no-regret learning: both efficiently converge to this type of equilibrium in non-concave games with smooth utilities.
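For readers unfamiliar with the setting, the following is a minimal sketch of projected Online Gradient Descent (written as gradient ascent on utilities) in a two-player game whose utilities are smooth but non-concave in each player's own strategy. It is an illustration only, not the construction analyzed in this work: the utilities $u_1(x, y) = \sin(3x)\,y$ and $u_2(x, y) = \cos(3y)\,x$, the strategy set $[-1, 1]$, the step size, and all function names are assumptions chosen for the example.

```python
import math


def project(z, lo=-1.0, hi=1.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, z))


def grad_u1(x, y):
    # Hypothetical utility u1(x, y) = sin(3x) * y, non-concave in x.
    # Partial derivative with respect to player 1's own strategy x.
    return 3.0 * math.cos(3.0 * x) * y


def grad_u2(x, y):
    # Hypothetical utility u2(x, y) = cos(3y) * x, non-concave in y.
    # Partial derivative with respect to player 2's own strategy y.
    return -3.0 * math.sin(3.0 * y) * x


def ogd(x0, y0, eta, steps):
    """Run simultaneous projected gradient ascent for both players.

    Each player moves along the gradient of their own utility with
    step size eta, then projects back onto [-1, 1].
    Returns the list of joint strategies (x_t, y_t), including the start.
    """
    x, y = x0, y0
    trajectory = [(x, y)]
    for _ in range(steps):
        gx, gy = grad_u1(x, y), grad_u2(x, y)  # both use the same (x, y)
        x = project(x + eta * gx)
        y = project(y + eta * gy)
        trajectory.append((x, y))
    return trajectory


if __name__ == "__main__":
    traj = ogd(x0=0.5, y0=-0.5, eta=0.05, steps=200)
    print("final joint strategy:", traj[-1])
```

The sequence of joint strategies returned by `ogd` is the kind of play history that an equilibrium notion for learning dynamics is evaluated on; the sketch itself makes no claim about where these particular iterates end up.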