Surveys are widely used in the social sciences to understand human behavior, but their implementation often involves iterative adjustments that demand significant effort and resources. To reduce this burden, researchers have increasingly turned to large language models (LLMs) to simulate human behavior. While existing studies have focused on distributional similarities between LLM-generated and human responses, individual-level comparisons remain underexplored. Building on prior work, we investigate whether an LLM supplied with respondents' prior information can replicate both statistical distributions and individual decision-making patterns, using Partial Least Squares Structural Equation Modeling (PLS-SEM), a well-established causal analysis method. We also introduce the LLM-Twin, a user persona generated by supplying respondent-specific information to the LLM. By comparing LLM-Twin responses with the corresponding individuals' actual survey responses, we assess the LLM-Twin's effectiveness in replicating individual-level outcomes. Our findings are threefold: (1) PLS-SEM analysis indicates that LLM-generated responses align with human responses; (2) when provided with respondent-specific information, LLMs can reproduce individual human responses; and (3) LLM-Twin responses closely track human responses at the individual level. These findings highlight the potential of LLMs as a complementary tool for pre-testing surveys and optimizing research design.