Behavioral experiments on the trust game have shown that trust and trustworthiness are universal among human beings, contradicting the prediction of orthodox economics, which assumes \emph{Homo economicus}. This implies that some mechanism must favor their emergence. Most previous explanations, however, rely on factors based on imitative learning, a simple form of social learning. Here, we turn to the paradigm of reinforcement learning, in which individuals update their strategies by evaluating long-term returns through accumulated experience. Specifically, we investigate the trust game with the Q-learning algorithm, where each participant holds two evolving Q-tables that guide decision making as trustor and as trustee, respectively. In the pairwise scenario, we find that high levels of trust and trustworthiness emerge when individuals value both their historical experience and future returns. Mechanistically, the evolution of the Q-tables shows a crossover that resembles psychological changes in humans. We also provide a phase diagram in the game parameters, accompanied by a boundary analysis. These findings remain robust when the scenario is extended to a population on a lattice. Our results thus provide a natural explanation for the emergence of trust and trustworthiness without invoking external factors. More importantly, the proposed paradigm shows potential for deciphering many puzzles in human behavior.
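To make the two-table setup concrete, the following is a minimal sketch of how a pair of Q-learning players might interact in a binary trust game. Only the standard Q-learning update rule is taken as given. The state encoding (whether trust was given and reciprocated in the previous round), the payoff values, the epsilon-greedy exploration, and all names (`Player`, `payoffs`, `play_round`, `alpha`, `gamma`, `epsilon`) are illustrative assumptions, not the specification used in this work.

```python
import random

ACTIONS = (0, 1)  # 0 = withhold (no trust / no return), 1 = trust / reciprocate


def make_q_table(n_states=2, n_actions=2):
    """Q-table initialised to zero: rows are states, columns are actions."""
    return [[0.0] * n_actions for _ in range(n_states)]


class Player:
    """Hypothetical participant holding one Q-table per role."""

    def __init__(self):
        self.q_trustor = make_q_table()
        self.q_trustee = make_q_table()


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy choice (assumed exploration scheme); ties broken at random."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    row = q_table[state]
    best = max(row)
    return rng.choice([a for a in ACTIONS if row[a] == best])


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning update with learning rate alpha and discount gamma."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] = (1 - alpha) * q_table[state][action] + alpha * target


def payoffs(trust, reciprocate, multiplier=3.0):
    """Illustrative payoffs (assumed): the trustor invests 1 unit, which is
    multiplied; a reciprocating trustee returns half of the multiplied amount."""
    if not trust:
        return 0.0, 0.0
    if reciprocate:
        return multiplier / 2 - 1.0, multiplier / 2
    return -1.0, multiplier


def play_round(trustor, trustee, state, alpha, gamma, epsilon, rng):
    """One interaction; returns the next state (1 if trust was reciprocated)."""
    trust = choose_action(trustor.q_trustor, state, epsilon, rng)
    reciprocate = choose_action(trustee.q_trustee, state, epsilon, rng) if trust else 0
    r_trustor, r_trustee = payoffs(trust, reciprocate)
    next_state = 1 if (trust and reciprocate) else 0
    q_update(trustor.q_trustor, state, trust, r_trustor, next_state, alpha, gamma)
    if trust:  # the trustee only acts, and hence only learns, when trusted
        q_update(trustee.q_trustee, state, reciprocate, r_trustee, next_state, alpha, gamma)
    return next_state


if __name__ == "__main__":
    rng = random.Random(0)
    a, b = Player(), Player()
    state = 0
    for _ in range(1000):
        state = play_round(a, b, state, alpha=0.1, gamma=0.9, epsilon=0.05, rng=rng)
    print(a.q_trustor, b.q_trustee)
```

In this sketch, a small `alpha` makes each player weight past estimates heavily, and a large `gamma` makes future returns count, which is one plausible reading of individuals who value both historical experience and future returns.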