Large language models (LLMs) are increasingly deployed to support human decision-making. This use of LLMs has concerning implications, especially when their prescriptions affect the welfare of others. To gauge how LLMs make social decisions, we explore whether five leading models produce sensible strategies in the repeated prisoner's dilemma, the canonical metaphor of reciprocal cooperation. First, we measure the propensity of LLMs to cooperate in a neutral setting, without using language reminiscent of how this game is usually presented. We record to what extent LLMs implement Nash equilibria or other well-known strategy classes. Thereafter, we explore how LLMs adapt their strategies to changes in parameter values. We vary the game's continuation probability, the payoff values, and whether the total number of rounds is commonly known. We also study the effect of different framings. In each case, we test whether the adaptations of the LLMs are in line with basic intuition, theoretical predictions of evolutionary game theory, and experimental evidence from human participants. While all LLMs perform well in many of the tasks, none of them exhibits full consistency across all tasks. We also conduct tournaments between the inferred LLM strategies and study direct interaction between LLMs in games over ten rounds with a known or unknown last round. Our experiments shed light on how current LLMs instantiate reciprocal cooperation.
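The setting described above can be made concrete with a minimal simulation. The sketch below plays a repeated prisoner's dilemma between two classic strategies (tit-for-tat and unconditional defection), ending each round with probability 1 − δ, where δ is the continuation probability. The payoff values (T=5, R=3, P=1, S=0) are the textbook defaults, assumed here for illustration; the paper's actual parameter values may differ.

```python
import random

# Canonical prisoner's dilemma payoffs (T=5, R=3, P=1, S=0) -- an assumed
# parameterization for illustration, not necessarily the paper's values.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_hist, opp_hist):
    # Cooperate in the first round, then mirror the opponent's last move.
    return "C" if not opp_hist else opp_hist[-1]

def always_defect(my_hist, opp_hist):
    # Defect unconditionally.
    return "D"

def play_repeated_pd(strategy1, strategy2, delta, rng):
    """Play the repeated game; after each round it continues with probability delta."""
    h1, h2 = [], []
    p1 = p2 = 0
    while True:
        a1 = strategy1(h1, h2)
        a2 = strategy2(h2, h1)
        r1, r2 = PAYOFFS[(a1, a2)]
        h1.append(a1)
        h2.append(a2)
        p1 += r1
        p2 += r2
        if rng.random() >= delta:  # game ends with probability 1 - delta
            return p1, p2

scores = play_repeated_pd(tit_for_tat, always_defect, 0.9, random.Random(0))
print(scores)
```

Against an unconditional defector, tit-for-tat loses only the first round; thereafter both sides defect, so the defector's total payoff exceeds tit-for-tat's by exactly T − S = 5 regardless of how long the game runs. This kind of pairwise payoff computation underlies the strategy tournaments mentioned in the abstract.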