This work provides a unified framework for exploring games. In the existing literature, players' strategies are typically assigned scalar values, and the concept of Nash equilibrium is used to identify compatible strategies. However, this approach ignores the internal structure of a player and therefore fails to accurately model behaviors observed in reality. To address this limitation, we propose characterizing players by their learning algorithms. Because the algorithms' estimates intrinsically induce a distribution over strategies, we introduce a notion of equilibrium that characterizes the recurrent behaviors of the learning algorithms. This approach allows for a more nuanced understanding of players and brings the focus to the learning challenge that players face. Our explorations in discrete games, mean-field games, and reinforcement learning demonstrate the framework's broad applicability and set the stage for future research aimed at specific applications.