We study the game modification problem, in which a benevolent game designer or a malevolent adversary modifies the reward function of a zero-sum Markov game at minimum cost so that a target deterministic or stochastic policy profile becomes the unique Markov perfect Nash equilibrium and its value falls within a target range. We characterize the set of policy profiles that can be installed as the unique equilibrium of a game and establish necessary and sufficient conditions for successful installation. We propose an efficient algorithm that solves a convex optimization problem with linear constraints and then performs random perturbation to obtain a modification plan with near-optimal cost. The code for our algorithm is available at https://github.com/YoungWu559/game-modification.
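To make the linear constraints concrete, the following is a minimal sketch of a much simpler special case than the one studied here: a single-state zero-sum game (a reward matrix) and a pure target profile. It is not the algorithm from the repository, and it omits the random perturbation step. The function name `install_pure_saddle`, the `margin` parameter, and the squared Euclidean cost are assumptions made for illustration. The row player maximizes and the column player minimizes, so forcing the target entry to be a strict saddle point (strictly largest in its column, strictly smallest in its row) makes it the unique Nash equilibrium of the matrix game.

```python
def install_pure_saddle(R, i_star, j_star, v_lo, v_hi, margin=1.0, iters=200):
    """Illustrative sketch (hypothetical helper, not the repository's code).

    Modify reward matrix R (row player maximizes, column player minimizes)
    at minimum squared Euclidean cost so that (i_star, j_star) is a strict
    saddle point with value in [v_lo, v_hi].
    Returns (modified matrix, cost).
    """
    n_rows, n_cols = len(R), len(R[0])

    def project(v):
        # For a fixed target value v the constraints decouple, so each entry
        # is moved independently and only as far as its constraint requires.
        R_new = [row[:] for row in R]
        R_new[i_star][j_star] = v
        for i in range(n_rows):
            if i != i_star:
                # Row deviations must earn at most v - margin.
                R_new[i][j_star] = min(R[i][j_star], v - margin)
        for j in range(n_cols):
            if j != j_star:
                # Column deviations must concede at least v + margin.
                R_new[i_star][j] = max(R[i_star][j], v + margin)
        return R_new

    def cost(v):
        R_new = project(v)
        return sum(
            (R_new[i][j] - R[i][j]) ** 2
            for i in range(n_rows)
            for j in range(n_cols)
        )

    # The cost is convex in v, so ternary search over [v_lo, v_hi] suffices.
    lo, hi = v_lo, v_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) <= cost(m2):
            hi = m2
        else:
            lo = m1
    v = (lo + hi) / 2.0
    return project(v), cost(v)


if __name__ == "__main__":
    # Matching-pennies-like game with no pure equilibrium before modification.
    R = [[1.0, 0.0], [0.0, 1.0]]
    R_new, c = install_pure_saddle(R, 0, 0, v_lo=0.0, v_hi=0.5)
    print(R_new, c)
```

In this reduced setting the problem collapses to a one-dimensional convex search over the target value, which keeps the sketch dependency-free. The general Markov game with stochastic target profiles requires a full convex program over all states and is not covered by this example.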