Social norms specify behaviour across multiple agents and thereby offer a coordination approach to resolving social dilemmas. Norms whose prescriptions involve interpreting stochastic signals in the environment can achieve wide, decentralized adoption. Such signals must be correlated enough to orchestrate mutually beneficial coordination, yet leave enough uncertainty about the benefits of exploiting that coordination to disincentivize exploitation. Evolutionary game theory of matrix games has been used to describe how a norm can evolve to become dominant in a population as rational agents compare and adopt norms. Morsky \& Akçay (2019) classify norms according to a set of rationality criteria. Joint strategies in which players adopt norms consistent with optimal single-player strategies, with respect to expected reward, naturally satisfy a correlated equilibrium condition rather than a Nash equilibrium condition. Here, we present a version of this theory that clarifies its basic ingredients. We formulate it in the more general Markov game setting commonly used in reinforcement learning theory. We illustrate the theory by mapping norms over the signal and reward space, and give a detailed exposition of the underlying mechanics of the approach. Finally, we give a general solution and analysis of the replicator dynamics that Morsky \& Akçay (2019) propose as a means by which these norms could emerge.
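As a rough illustration of the replicator dynamics mentioned above, the following sketch integrates the textbook replicator equation dx_i/dt = x_i (f_i - f̄) for a symmetric two-strategy matrix game using forward-Euler steps. It is not the general solution presented in the paper: the payoff matrix `PAYOFF`, the step size, and the function names are hypothetical choices made for illustration only, with strategy 0 standing in for a norm that pays off when widely adopted and strategy 1 for a safe alternative.

```python
def replicator_step(x, payoff, dt=0.01):
    """One forward-Euler step of dx_i/dt = x_i * (f_i - f_bar)."""
    n = len(x)
    # Expected payoff of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Population-average payoff.
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    new_x = [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]
    # Renormalize to guard against floating-point drift off the simplex.
    total = sum(new_x)
    return [v / total for v in new_x]


def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the Euler step from the initial population shares x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Hypothetical coordination-game payoffs (assumed, not taken from the paper):
# strategy 0 earns 4 against itself and 0 otherwise; strategy 1 always earns 3.
PAYOFF = [[4.0, 0.0],
          [3.0, 3.0]]

if __name__ == "__main__":
    print(simulate([0.8, 0.2], PAYOFF))  # starts above the interior threshold
    print(simulate([0.7, 0.3], PAYOFF))  # starts below the interior threshold
```

With these assumed payoffs the two strategies earn equal payoff when strategy 0 holds a 0.75 share, so an initial share above that threshold drifts toward full adoption of strategy 0 and an initial share below it drifts toward strategy 1. This bistability is one simple way to picture a norm becoming dominant only once it is sufficiently widespread.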