The replicator equation in evolutionary game theory describes how a population's behaviors change over time given suitable incentives. It arises when individuals make decisions using a simple learning process: imitation. A recently developed framework builds on this standard model by incorporating game-environment feedback, in which the population's actions affect a shared environment and the changing environment, in turn, shapes the incentives for future behaviors. In this paper, we investigate game-environment feedback when individuals instead use a boundedly rational learning rule known as logit learning. We characterize the complete set of fixed points of the resulting system, their local stability properties, and how the level of rationality determines overall environmental outcomes in comparison with imitative learning rules. We identify a large region of parameter space in which logit learning exhibits a wide range of dynamics as the rationality parameter increases from low to high. Notably, we identify a bifurcation point at which the system exhibits stable limit cycles. When the population is highly rational, the limit cycle collapses and a tragedy of the commons becomes stable.
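To make the setup concrete, the following is a minimal sketch of logit learning coupled to a shared environment, not the paper's actual model. It assumes a two-action game (cooperate or defect), a logit choice rule with rationality parameter `beta`, and an environment equation of the form commonly used in replicator models with game-environment feedback. The function names, the payoff-gap form, and the parameters `theta`, `eps`, `gap_depleted`, and `gap_replete` are all hypothetical choices made for illustration.

```python
import math


def logit_choice(payoff_gap, beta):
    """Probability of picking the cooperative action under logit choice.

    beta is the rationality parameter: beta = 0 gives a coin flip,
    and large beta approaches a best response to the payoff gap.
    """
    z = beta * payoff_gap
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def payoff_gap(x, n, gap_depleted=(1.0, 1.0), gap_replete=(-1.0, -1.0)):
    """Assumed payoff advantage of cooperating over defecting.

    x is the fraction of cooperators and n in [0, 1] is the environment
    state. Each tuple holds the advantage against a cooperator and against
    a defector; the two games are interpolated linearly in n. The defaults
    (cooperation pays when depleted, defection pays when replete) are
    illustrative only.
    """
    a0, b0 = gap_depleted
    a1, b1 = gap_replete
    low = a0 * x + b0 * (1.0 - x)
    high = a1 * x + b1 * (1.0 - x)
    return (1.0 - n) * low + n * high


def simulate(x0, n0, beta, theta=2.0, eps=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of the assumed coupled system.

    dx/dt = logit_choice(gap(x, n), beta) - x
    dn/dt = eps * n * (1 - n) * (theta * x - (1 - x))
    """
    x, n = x0, n0
    path = [(x, n)]
    for _ in range(steps):
        dx = logit_choice(payoff_gap(x, n), beta) - x
        dn = eps * n * (1.0 - n) * (theta * x - (1.0 - x))
        x, n = x + dt * dx, n + dt * dn
        path.append((x, n))
    return path
```

Calling `simulate(0.9, 0.3, beta)` for a range of `beta` values and plotting the returned `(x, n)` pairs is one way to explore how the trajectories change with the rationality parameter. Under these assumed payoffs the sketch is only a starting point; it makes no claim to reproduce the bifurcation structure reported in the paper.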