This paper proposes a new framework for studying multi-agent interaction in Markov games: Markov $\alpha$-potential games. Markov potential games are special cases of Markov $\alpha$-potential games, as are two practically significant classes of games: Markov congestion games and perturbed Markov team games. $\alpha$-potential functions are provided for both classes, and the gap $\alpha$ is characterized in terms of the game parameters. Two algorithms, the projected gradient-ascent algorithm and the sequential maximum improvement smoothed best response dynamics, are introduced for approximating the stationary Nash equilibrium in Markov $\alpha$-potential games. The Nash regret of each algorithm is shown to scale sublinearly in the time horizon. Our analysis and numerical experiments demonstrate that simple algorithms are capable of finding approximate equilibria in Markov $\alpha$-potential games.