We initiate the study of repeated game dynamics in the population model, in which we are given a population of $n$ nodes, each holding a local strategy. Nodes interact uniformly at random by playing multi-round, two-player games. After each game, the two participants receive rewards according to a given payoff matrix, and may update their local strategies depending on this outcome. In this setting, we ask how the distribution of player strategies evolves with respect to the number of node interactions (time complexity), as well as the number of possible player states (space complexity), and we determine the stationary properties of such game dynamics. Our main technical results analyze the behavior of a family of Repeated Prisoner's Dilemma dynamics in this model, for which we provide an exact characterization of the stationary distribution, and give bounds on the convergence time and on the optimality gap of the expected rewards. Our results follow from a new connection between Repeated Prisoner's Dilemma dynamics in a population and a class of high-dimensional, weighted Ehrenfest random walks, which we analyze for the first time. The results highlight non-trivial trade-offs between the state complexity of each node's strategy, the convergence of the process, and the expected average reward of nodes in the population. Our approach opens the door to the characterization of other natural evolutionary game dynamics in the population model.
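To make the interaction model concrete, the following is a minimal simulation sketch of a population playing repeated games under uniformly random pairing. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the three strategies (`ALLC`, `ALLD`, `TFT`), the payoff values, the number of rounds per game, and in particular the imitation update rule, which is a hypothetical stand-in and not the family of dynamics analyzed in the paper.

```python
import random
from collections import Counter

# Assumed Prisoner's Dilemma payoffs (reward of the first player); any values
# with T > R > P > S would do. These numbers are illustrative only.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Hypothetical strategy set: always cooperate, always defect, tit-for-tat.
STRATEGIES = ("ALLC", "ALLD", "TFT")


def action(strategy, opponent_last):
    """Return 'C' or 'D' given a strategy and the opponent's previous action."""
    if strategy == "ALLC":
        return "C"
    if strategy == "ALLD":
        return "D"
    # TFT: cooperate in the first round, then mirror the opponent's last move.
    return "C" if opponent_last is None else opponent_last


def play_repeated_game(s1, s2, rounds):
    """Play a multi-round game and return the total rewards of both players."""
    last1 = last2 = None
    r1 = r2 = 0
    for _ in range(rounds):
        a1 = action(s1, last2)
        a2 = action(s2, last1)
        r1 += PAYOFF[(a1, a2)]
        r2 += PAYOFF[(a2, a1)]
        last1, last2 = a1, a2
    return r1, r2


def simulate(n, interactions, rounds=5, seed=0):
    """Run `interactions` pairwise games and return the final strategy counts."""
    rng = random.Random(seed)
    population = [rng.choice(STRATEGIES) for _ in range(n)]
    for _ in range(interactions):
        # Two distinct nodes chosen uniformly at random.
        i, j = rng.sample(range(n), 2)
        r_i, r_j = play_repeated_game(population[i], population[j], rounds)
        # Hypothetical update rule: the lower earner imitates the higher earner.
        if r_i > r_j:
            population[j] = population[i]
        elif r_j > r_i:
            population[i] = population[j]
    return Counter(population)


if __name__ == "__main__":
    print(simulate(n=100, interactions=5000))
```

Here the number of loop iterations in `simulate` plays the role of the time complexity (number of node interactions), and the size of `STRATEGIES` plays the role of the per-node state space. Swapping in a different update rule only requires changing the two assignments at the end of the loop.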