This paper proposes a new framework for studying multi-agent interaction in Markov games: Markov $\alpha$-potential games. Markov potential games are special cases of Markov $\alpha$-potential games, as are two important and practically significant classes of games: Markov congestion games and perturbed Markov team games. For both classes, $\alpha$-potential functions are provided and the gap $\alpha$ is characterized in terms of the game parameters. Two algorithms, the projected gradient-ascent algorithm and the sequential maximum improvement smoothed best response dynamics, are introduced for approximating the stationary Nash equilibrium in Markov $\alpha$-potential games. The Nash regret of each algorithm is shown to scale sublinearly in the time horizon. Our analysis and numerical experiments demonstrate that simple algorithms are capable of finding approximate equilibria in Markov $\alpha$-potential games.