The rise of large language models (LLMs) that generate human-like text has sparked debate over whether they could replace human participants in behavioral and cognitive research. We critically evaluate this replacement perspective to appraise the fundamental utility of language models in psychology and social science. Using a five-dimension framework (characterization, representation, interpretation, implication, and utility), we identify six fallacies that undermine the replacement perspective: (1) equating token prediction with human intelligence, (2) assuming LLMs represent the average human, (3) interpreting alignment as explanation, (4) anthropomorphizing AI, (5) essentializing identities, and (6) treating LLMs as primary tools that directly reveal the human mind. The evidence and arguments are instead consistent with a simulation perspective, in which LLMs offer a new paradigm for simulating roles and modeling cognitive processes. We highlight limitations and considerations regarding internal, external, construct, and statistical validity, and provide methodological guidelines for integrating LLMs into psychological research effectively, with a focus on model selection, prompt design, interpretation, and ethical considerations. This perspective reframes language models in behavioral and cognitive science as linguistic simulators and cognitive models that shed light on the similarities and differences between machine intelligence and human cognition and thought.