The use of Large Language Models (LLMs) to simulate human perspectives via persona prompting is gaining traction in computational social science. However, well-curated, empirically grounded persona collections remain scarce, which limits the accuracy and representativeness of such simulations. Here, we introduce the German General Social Survey Personas (GGSS Personas) collection, a comprehensive and representative collection of persona prompts built from the German General Social Survey (ALLBUS). The GGSS Personas and their persona prompts are designed to be easily plugged into prompts for all types of LLMs and tasks, steering models to generate responses aligned with those of the underlying German population. We evaluate GGSS Personas by prompting various LLMs to simulate survey response distributions across diverse topics, and show that LLMs guided by GGSS Personas outperform state-of-the-art classifiers, particularly under data scarcity. Furthermore, we analyze how the representativeness of the personas and the selection of attributes within persona prompts affect alignment with population responses. Our findings suggest that GGSS Personas are a potentially valuable resource for research on LLM-based social simulations, enabling more systematic exploration of population-aligned persona prompting in NLP and social science research.
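To make the evaluation idea above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a persona is a small dictionary of survey attributes, that simulated answers are collected as a list of option labels, and that alignment is scored with the Jensen-Shannon divergence; the function names, the prompt wording, and the choice of metric are all illustrative assumptions rather than details taken from the paper.

```python
import math
from collections import Counter


def build_persona_prompt(attributes, question, options):
    """Render hypothetical persona attributes into a plain-text prompt."""
    profile = "; ".join(f"{key}: {value}" for key, value in attributes.items())
    choices = ", ".join(options)
    return (
        f"You are a survey respondent living in Germany. Profile: {profile}.\n"
        f"Question: {question}\n"
        f"Answer with exactly one of: {choices}."
    )


def response_distribution(answers, options):
    """Turn a list of answer labels into relative frequencies over `options`."""
    counts = Counter(answers)
    total = sum(counts[option] for option in options)
    if total == 0:
        raise ValueError("no valid answers to aggregate")
    return [counts[option] / total for option in options]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)
```

In use, one prompt would be built per persona, each prompt sent to an LLM of choice, the returned labels aggregated with `response_distribution`, and the result compared against the observed survey distribution with `js_divergence`, where lower values indicate closer alignment.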