Behavioral experiments on the trust game have shown that trust and trustworthiness are universal among human beings, contradicting the prediction of orthodox economics, which assumes \emph{Homo economicus}. Some mechanism must therefore be at work that favors their emergence. Most previous explanations, however, resort to factors based on imitative learning, a simple form of social learning. Here, we turn to the paradigm of reinforcement learning, in which individuals update their strategies by evaluating the long-term return through accumulated experience. Specifically, we investigate the trust game with the Q-learning algorithm, where each participant is associated with two evolving Q-tables that guide decision making as trustor and as trustee, respectively. In the pairwise scenario, we reveal that high levels of trust and trustworthiness emerge when individuals value both their historical experience and future returns. Mechanistically, the evolution of the Q-tables shows a crossover that resembles psychological changes in humans. We also provide a phase diagram for the game parameters, together with an analysis of its boundaries. These findings remain robust when the scenario is extended to a population on a lattice. Our results thus provide a natural explanation for the emergence of trust and trustworthiness that requires no external factors. More importantly, the proposed paradigm shows potential for deciphering many puzzles in human behavior.
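
To make the setup concrete, the following is a minimal sketch of how a participant with two Q-tables might be represented and updated with the standard Q-learning rule. It is an illustration under stated assumptions, not the authors' implementation: the binary action set, the number of states, the epsilon-greedy exploration, and the names `alpha` (learning rate), `gamma` (discount factor), and `epsilon` (exploration rate) are all hypothetical choices, since the abstract does not specify them.

```python
import random

# Assumption: binary actions, 0 = withhold (no trust / no return),
# 1 = cooperate (trust / reciprocate). The abstract does not fix this.
ACTIONS = (0, 1)


def new_q_table(n_states=2):
    """Create a Q-table: one row of action values per (hypothetical) state."""
    return [[0.0 for _ in ACTIONS] for _ in range(n_states)]


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    row = q_table[state]
    return max(ACTIONS, key=lambda a: row[a])


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning update.

    Q(s, a) <- (1 - alpha) * Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a'))
    """
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] = (1 - alpha) * q_table[state][action] + alpha * target


class Agent:
    """A participant holding separate Q-tables for its two roles."""

    def __init__(self, n_states=2):
        self.q_trustor = new_q_table(n_states)  # guides decisions as trustor
        self.q_trustee = new_q_table(n_states)  # guides decisions as trustee
```

In this sketch, a small `alpha` keeps more of the previously learned value and a large `gamma` weights future returns more heavily; reading these as "valuing historical experience" and "valuing future returns" is an interpretation of the abstract rather than something it states.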