We administer a Turing Test to AI chatbots. We examine how chatbots behave in a suite of classic behavioral games designed to elicit characteristics such as trust, fairness, risk-aversion, and cooperation, as well as how they respond to a traditional Big-5 psychological survey that measures personality traits. ChatGPT-4 exhibits behavioral and personality traits that are statistically indistinguishable from those of a random human drawn from a pool of tens of thousands of human subjects from more than 50 countries. Chatbots also modify their behavior based on previous experience and context ``as if'' they were learning from the interactions, and they change their behavior in response to different framings of the same strategic situation. When their behaviors are distinct from average and modal human behaviors, which is often the case, they tend to fall on the more altruistic and cooperative end of the distribution. We estimate that they act as if they are maximizing an average of their own and their partner's payoffs.
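As a minimal sketch of what such an objective could look like, one may write it as a weighted average of the two payoffs. The symbols $\pi_{\mathrm{own}}$, $\pi_{\mathrm{partner}}$, and the weight $b$ are illustrative notation introduced here; the text above does not specify a functional form.
\[
  U(\pi_{\mathrm{own}}, \pi_{\mathrm{partner}}) = b\,\pi_{\mathrm{own}} + (1-b)\,\pi_{\mathrm{partner}}, \qquad 0 \le b \le 1 .
\]
Under this assumed form, $b = 1$ describes a purely self-interested player, while $b = \tfrac{1}{2}$ gives the simple average of the player's own and the partner's payoffs.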