We consider the problem of computing mixed Nash equilibria of two-player zero-sum games with continuous sets of pure strategies and with first-order access to the payoff function. This problem arises, for example, in game-theory-inspired machine learning applications such as distributionally robust learning. In those applications, the strategy sets are high-dimensional, and thus methods based on discretization cannot tractably return high-accuracy solutions. In this paper, we introduce and analyze a particle-based method that enjoys guaranteed local convergence for this problem. The method consists of parametrizing the mixed strategies as atomic measures and applying proximal point updates to both the atoms' weights and positions. It can be interpreted as a time-implicit discretization of the "interacting" Wasserstein-Fisher-Rao gradient flow. We prove that, under non-degeneracy assumptions, this method converges at an exponential rate to the exact mixed Nash equilibrium from any initialization satisfying a natural notion of closeness to optimality. We illustrate our results with numerical experiments and discuss applications to max-margin and distributionally robust classification using two-layer neural networks, where our method has a natural interpretation as simultaneous training of the network's weights and of the adversarial distribution.
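To make the parametrization concrete, the following is a minimal sketch, not the method analyzed in the paper. Each mixed strategy is stored as an atomic measure (weights on the simplex plus atom positions); weights receive multiplicative (Fisher-Rao-type) updates and positions receive gradient (Wasserstein-type) updates. The implicit proximal point step is replaced here by an extragradient step, which is a common explicit approximation of it. The payoff `payoff`, the one-dimensional strategy sets, the particle count, and the step sizes `eta_w` and `eta_p` are all illustrative assumptions.

```python
import numpy as np

def payoff(x, y):
    """Hypothetical smooth payoff; entry (i, j) is f(x_i, y_j)."""
    d = x[:, None] - y[None, :]
    s = 2 * x[:, None] + y[None, :]
    return np.sin(d) + 0.5 * np.cos(s)

def payoff_grads(x, y):
    """Partial derivatives of the hypothetical payoff in x and in y."""
    d = x[:, None] - y[None, :]
    s = 2 * x[:, None] + y[None, :]
    return np.cos(d) - np.sin(s), -np.cos(d) - 0.5 * np.sin(s)

def vector_field(a, x, b, y):
    """First-order information for both players at the current atoms."""
    f = payoff(x, y)
    dfdx, dfdy = payoff_grads(x, y)
    # Min player (a, x) sees the payoff averaged over the opponent's atoms,
    # and symmetrically for the max player (b, y).
    return f @ b, dfdx @ b, a @ f, a @ dfdy

def explicit_step(state, field, eta_w, eta_p):
    """Explicit step from `state` using the first-order information `field`."""
    a, x, b, y = state
    va, gx, vb, gy = field
    a_new = a * np.exp(-eta_w * va)   # multiplicative descent on weights
    b_new = b * np.exp(eta_w * vb)    # multiplicative ascent on weights
    a_new /= a_new.sum()
    b_new /= b_new.sum()
    return a_new, x - eta_p * gx, b_new, y + eta_p * gy

def extragradient_step(state, eta_w=0.1, eta_p=0.1):
    """Extragradient step: an explicit stand-in for the implicit proximal step."""
    half = explicit_step(state, vector_field(*state), eta_w, eta_p)
    return explicit_step(state, vector_field(*half), eta_w, eta_p)

def gap_on_atoms(a, x, b, y):
    """Duality gap restricted to pure strategies at the current atoms.

    This only lower-bounds the true gap, which ranges over all pure strategies.
    """
    f = payoff(x, y)
    return (a @ f).max() - (f @ b).min()

def run(n_particles=8, n_iters=200, seed=0):
    """Iterate the sketch from uniform weights and random atom positions."""
    rng = np.random.default_rng(seed)
    w = np.full(n_particles, 1.0 / n_particles)
    state = (w, rng.uniform(0, 2 * np.pi, n_particles),
             w.copy(), rng.uniform(0, 2 * np.pi, n_particles))
    for _ in range(n_iters):
        state = extragradient_step(state)
    return state
```

Calling `run()` returns the final weights and positions of both players, and `gap_on_atoms(*run())` gives a rough progress indicator. The convergence guarantee stated in the abstract concerns the proximal point method under the paper's assumptions and should not be read as applying to this simplified variant.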