Of the many commercial and scientific opportunities provided by large language models (LLMs; including OpenAI's ChatGPT, Meta's LLaMA, and Anthropic's Claude), one of the more intriguing applications has been the simulation of human behavior and opinion. LLMs have been used to generate human simulacra to serve as experimental participants, survey respondents, or other independent agents, with outcomes that often closely parallel the observed behavior of their genuine human counterparts. Here, we specifically consider the feasibility of using LLMs to estimate subpopulation representative models (SRMs). SRMs could provide an alternative or complementary way to measure public opinion among demographic, geographic, or political segments of the population. However, introducing new technology into the socio-technical infrastructure does not come without risk. We provide an overview of behavior elicitation techniques for LLMs and a survey of existing SRM implementations. We offer frameworks for the analysis, development, and practical implementation of LLMs as SRMs, consider potential risks, and suggest directions for future work.