ε-Stationary Nash Equilibria in Multi-player Stochastic Graph Games

A strategy profile in a multi-player game is a Nash equilibrium if no player can unilaterally deviate to achieve a strictly better payoff. A profile is an $ε$-Nash equilibrium if no player can gain more than $ε$ by unilaterally deviating from their strategy. In this work, we use $ε$-Nash equilibria to approximate the computation of Nash equilibria. Specifically, we focus on turn-based, multi-player stochastic games played on graphs, where players are restricted to stationary strategies -- strategies that use randomness but not memory. The problem of deciding the constrained existence of stationary Nash equilibria -- where each player's payoff must lie within a given interval -- is known to be $\exists\mathbb{R}$-complete in this setting (Hansen and Sølvsten, 2020). We extend this line of work to stationary $ε$-Nash equilibria and present an algorithm that solves the following promise problem: given a game with a Nash equilibrium satisfying the constraints, compute an $ε$-Nash equilibrium that $ε$-satisfies those same constraints, that is, satisfies them up to an additive error of $ε$. Our algorithm runs in FNP^NP. To achieve this, we first show that if a constrained Nash equilibrium exists, then one exists in which every non-zero probability is at least the inverse of a double exponential in the input size. We then prove that such a strategy can be encoded using floating-point representations, as in the work of Frederiksen and Miltersen (2013), which yields our FNP^NP algorithm. We further show that the decision version of the promise problem is NP-hard. Finally, we show a partial tightness result by proving a lower bound for such techniques: if a constrained Nash equilibrium exists, then there must be one in which the probabilities in the strategies are double-exponentially small.
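To illustrate why floating-point representations matter for double-exponentially small probabilities, the following minimal Python sketch compares the size of a fixed-point and a floating-point encoding of the number $2^{-2^k}$. It is only an illustration under stated assumptions: the (mantissa, exponent) format and the helper names `float_value` and `encoding_bits` are hypothetical and are not taken from the paper or from Frederiksen and Miltersen (2013).

```python
from fractions import Fraction


def float_value(mantissa: int, exponent: int) -> Fraction:
    """Exact value of the hypothetical floating-point pair: mantissa * 2**(-exponent)."""
    return Fraction(mantissa, 2 ** exponent)


def encoding_bits(mantissa: int, exponent: int) -> int:
    """Bits needed to write the mantissa and the exponent in binary."""
    return mantissa.bit_length() + exponent.bit_length()


# A double-exponentially small probability: p = 2^(-2^k) with k = 6.
k = 6
exponent = 2 ** k                      # 64
p = float_value(1, exponent)           # exactly 2^(-64)

# Fixed-point binary needs one digit per position after the binary point,
# so writing p exactly takes 2^k digits (exponential in k).
fixed_point_bits = exponent

# The floating-point pair stores the exponent in binary, so it needs only
# about k bits for the exponent plus the bits of the mantissa.
floating_point_bits = encoding_bits(1, exponent)

print(fixed_point_bits, floating_point_bits)
```

In this toy format the exponent is stored in binary, so the encoding grows linearly in `k` while the fixed-point expansion grows exponentially in `k`; this is the intuition behind encoding such strategies compactly.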
