Scientists and practitioners are aggressively moving to deploy digital twins (LLM-based models of real individuals) across social science and policy research. We conducted 19 pre-registered studies spanning 164 diverse outcomes (e.g., attitudes toward hiring algorithms, intention to share misinformation) and compared human responses with those of their digital twins, each trained on that person's previous answers to over 500 questions. We find that digital twins' answers are only modestly more accurate than those of the homogeneous base LLM and correlate weakly with human responses (average r = 0.20). We document five ways in which digital twins distort human behavior: (i) stereotyping, (ii) insufficient individuation, (iii) representation bias, (iv) ideological bias, and (v) hyper-rationality. Together, our results caution against the premature deployment of digital twins, which may systematically misrepresent human cognition and undermine both scientific understanding and practical applications.