This work introduces a unified framework for analyzing games in greater depth. In the existing literature, players' strategies are typically assigned scalar values, and equilibrium concepts are used to identify compatible choices. This approach, however, neglects the internal structure of players and therefore fails to accurately model observed behaviors. To address this limitation, we propose an abstract definition of a player, consistent with constructions in reinforcement learning. Rather than treating games as external settings, our framework defines them in terms of the players themselves, offering a language that connects games and learning more deeply. To illustrate the need for this generality, we study a simple two-player game and show that, even in basic settings, a sophisticated player may adopt dynamic strategies that simpler models and compatibility analyses cannot capture. For the general definition of a player, we discuss natural conditions on its components and define competition through players' behavior. In the discrete setting, we consider players whose estimates largely follow the standard framework from the literature; we explore connections to correlated equilibrium and highlight that dynamic programming applies naturally to all estimates. In the mean-field setting, we exploit symmetry to construct explicit examples of equilibria. We conclude by examining relations to reinforcement learning.