Cognitive biases often shape human decisions. While large language models (LLMs) have been shown to reproduce well-known biases, a more critical question is whether LLMs can predict biases at the individual level and emulate the dynamics of biased human behavior when contextual factors, such as cognitive load, interact with these biases. We adapted three well-established decision scenarios into a conversational setting and conducted a human experiment (N=1100). Participants engaged with a chatbot that facilitated decision-making through either simple or complex dialogues. Results revealed robust cognitive biases. To evaluate how LLMs emulate human decision-making under similar interactive conditions, we simulated the same conditions with GPT-4- and GPT-5-based LLMs, using participant demographics and dialogue transcripts. The LLMs closely reproduced human biases, but we found notable differences between models in how closely they aligned with human behavior. These findings have important implications for designing and evaluating adaptive, bias-aware LLM-based AI systems in interactive contexts.
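To make the simulation setup concrete, the sketch below illustrates one plausible way to condition an LLM "participant" on a human participant's demographics and dialogue transcript before eliciting its decision. This is a minimal sketch, not the authors' exact protocol: the prompt wording, the `simulate_decision` helper, the demographic fields, and the model identifiers are illustrative assumptions; only the OpenAI chat completions API calls themselves are real.

```python
# Minimal sketch of the LLM simulation described in the abstract:
# condition the model on participant demographics plus the original
# chatbot dialogue, then ask for the same decision the human made.
# Prompt wording and model names are assumptions, not the paper's protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def simulate_decision(demographics: dict, transcript: str,
                      model: str = "gpt-4") -> str:
    """Role-play the human participant and return the model's decision."""
    persona = ", ".join(f"{k}: {v}" for k, v in demographics.items())
    messages = [
        {"role": "system",
         "content": (f"You are a study participant ({persona}). "
                     "Read your dialogue with the chatbot below and "
                     "state your final decision.")},
        {"role": "user", "content": transcript},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


# Example usage with hypothetical data:
# decision = simulate_decision(
#     {"age": 34, "gender": "female"},
#     transcript="Chatbot: ...\nParticipant: ...",
# )
```

Running the same transcript through each model (e.g., a GPT-4 and a GPT-5 variant) and comparing the simulated decisions against the human's recorded choice is one straightforward way to quantify the per-model alignment differences the abstract reports.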