We study the behaviour that emerges when learning agents interact strategically and repeatedly, under a wide range of learning dynamics that includes projected gradient, replicator and log-barrier dynamics. Going beyond the better-understood classes of potential games and zero-sum games, we consider general repeated games with finite recall under different forms of monitoring. We obtain a Folk Theorem-like result that characterises the set of payoff vectors attainable under these dynamics, revealing a wide range of possibilities for the emergence of algorithmic collusion.