Multi-agent learning in many-player games has been shown to exhibit complex dynamics outside of restrictive classes such as network zero-sum games. Moreover, convergent behaviour has been shown to become less likely as the number of players increases. To make progress on this problem, we study Q-Learning dynamics and determine a sufficient condition under which they converge to a unique equilibrium in any network game. We find that this condition depends on the nature of the pairwise interactions and on the network structure, but is explicitly independent of the total number of agents in the game. We evaluate this result on a number of representative network games and show that, under suitable network conditions, stable learning dynamics can be achieved with an arbitrary number of agents.