Repeated games provide a framework for investigating long-term interdependence in multi-agent systems. In repeated games, zero-determinant (ZD) strategies have attracted much attention in evolutionary game theory because they can unilaterally control payoffs. In particular, fair ZD strategies unilaterally equalize the payoff of the focal player and the average payoff of the opponents, and they have been found in several games, including social dilemma games. Although the existence condition of ZD strategies in repeated games has been specified, its extension to stochastic games remains largely unclear. Stochastic games are an extension of repeated games in which the environment has a state, and this state changes according to the action profile of the players. Because of these transitions of the environmental state, the existence condition of ZD strategies in stochastic games is more complicated than that in repeated games. Here, we investigate the existence condition of fair ZD strategies in the periodic prisoner's dilemma game, which is one of the simplest stochastic games. We show that fair ZD strategies do not necessarily exist in the periodic prisoner's dilemma game, in contrast to the repeated prisoner's dilemma game. Furthermore, we prove that the Tit-for-Tat strategy, which imitates the opponent's previous action, is not necessarily a fair ZD strategy in the periodic prisoner's dilemma game, whereas it is always a fair ZD strategy in the repeated prisoner's dilemma game. Our results highlight the differences between ZD strategies in the periodic prisoner's dilemma game and those in the standard repeated prisoner's dilemma game.
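As a small illustration of the repeated-game side of this contrast only, the following sketch plays Tit-for-Tat against an arbitrary opponent in the standard repeated prisoner's dilemma, with no environmental state. The payoff values (R, S, T, P) = (3, 0, 5, 1), the undiscounted setting, the rule that Tit-for-Tat cooperates in the first round, and the function name `play_tft` are assumptions made for this example and are not taken from the paper. Because Tit-for-Tat copies the opponent's previous action, the rounds in which exactly one player defects telescope, so the gap in total payoffs should never exceed T - S, and the gap in average payoffs vanishes as the number of rounds grows. The sketch does not model the periodic prisoner's dilemma game.

```python
import random

# Illustrative prisoner's dilemma payoffs (assumed values with T > R > P > S).
R, S, T, P = 3, 0, 5, 1
# Maps (action of X, action of Y) to (payoff of X, payoff of Y).
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def play_tft(opponent_moves):
    """Return total payoffs (Tit-for-Tat player X, opponent Y)."""
    total_x = total_y = 0
    x_move = "C"  # assumption: Tit-for-Tat opens with cooperation
    for y_move in opponent_moves:
        payoff_x, payoff_y = PAYOFF[(x_move, y_move)]
        total_x += payoff_x
        total_y += payoff_y
        x_move = y_move  # imitate the opponent's previous action
    return total_x, total_y


rng = random.Random(0)
n_rounds = 10_000
opponent_moves = [rng.choice("CD") for _ in range(n_rounds)]
total_x, total_y = play_tft(opponent_moves)
gap = abs(total_x - total_y)
print("total payoff gap:", gap)
print("average payoff gap:", gap / n_rounds)
```

The opponent here is a random move sequence, but the bound on the gap follows from the imitation rule alone and does not depend on how the opponent's moves are generated.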