We conduct a comprehensive analysis of the discrete-time exponential-weights dynamic with a constant step size on all general-sum symmetric $2 \times 2$ normal-form games, i.e., games with $2$ pure strategies per player whose payoff tuple has the form $(A,A^\top)$, where $A$ is the $2 \times 2$ payoff matrix of the first player. Such symmetric games commonly arise in real-world interactions between "symmetric" agents who have identically defined utility functions, such as Bertrand competition and multi-agent performative prediction, and they display a rich multiplicity of equilibria despite the seemingly simple setting. Somewhat surprisingly, we show through a first-principles analysis that the exponential-weights dynamic, which is popular in online learning, converges in the last iterate for such games regardless of initialization, provided the step size is chosen appropriately. For certain games and/or initializations, we further show that the convergence rate is in fact exponential and that this holds for any step size. We illustrate our theory with extensive simulations and with applications to the aforementioned game-theoretic interactions. In the case of multi-agent performative prediction, we formulate a new "mortgage competition" game between lenders (i.e., banks) who interact with a population of customers, and show that it fits into our framework.
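To make the dynamic concrete, the following is a minimal sketch of a constant-step-size exponential-weights update on a symmetric $2 \times 2$ game $(A, A^\top)$. It is an illustration under stated assumptions, not the paper's exact formulation: both players are assumed to update simultaneously and to observe the expected payoff of each pure strategy against the opponent's current mixed strategy. The function names `exp_weights_step` and `run_exp_weights`, the coordination-game matrix, the step size `eta = 0.5`, and the initial points are all hypothetical choices made for this example.

```python
import math


def exp_weights_step(A, p, q, eta):
    """One simultaneous exponential-weights update in the game (A, A^T).

    p and q are the probabilities that players 1 and 2 place on their
    first pure strategy. Full-information feedback is an assumption here.
    """
    # Expected payoff of each pure strategy of player 1 against (q, 1 - q).
    u1 = [A[i][0] * q + A[i][1] * (1.0 - q) for i in range(2)]
    # Player 2's payoff matrix is A^T, so its payoff vector is A x
    # with x = (p, 1 - p).
    u2 = [A[j][0] * p + A[j][1] * (1.0 - p) for j in range(2)]

    # Multiply each weight by exp(eta * payoff), then renormalize.
    w1 = [p * math.exp(eta * u1[0]), (1.0 - p) * math.exp(eta * u1[1])]
    w2 = [q * math.exp(eta * u2[0]), (1.0 - q) * math.exp(eta * u2[1])]
    return w1[0] / (w1[0] + w1[1]), w2[0] / (w2[0] + w2[1])


def run_exp_weights(A, p0, q0, eta, steps):
    """Iterate the update and return the trajectory of (p, q) pairs."""
    trajectory = [(p0, q0)]
    p, q = p0, q0
    for _ in range(steps):
        p, q = exp_weights_step(A, p, q, eta)
        trajectory.append((p, q))
    return trajectory


if __name__ == "__main__":
    # Hypothetical coordination game: both players prefer to match strategies.
    A = [[2.0, 0.0], [0.0, 1.0]]
    final_p, final_q = run_exp_weights(A, p0=0.7, q0=0.6, eta=0.5, steps=200)[-1]
    print(final_p, final_q)
```

In this particular example the log-odds of the first strategy increase at every step for both players, so the last iterate approaches the pure equilibrium in which both play their first strategy. Other payoff matrices, initializations, or step sizes may behave differently, which is the question the analysis addresses.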