Large language models (LLMs) have recently been used to investigate public opinion. This study examines the algorithmic fidelity of LLMs, i.e., their ability to replicate the socio-cultural context and nuanced opinions of human participants. Using open-ended survey data from the German Longitudinal Election Study (GLES), we prompt different LLMs to generate synthetic public opinions reflective of German subpopulations by incorporating demographic features into persona prompts. Our results show that Llama represents subpopulations better than the other LLMs, particularly when opinion diversity within a group is low. The models also perform better for supporters of left-leaning parties such as The Greens and The Left than for other parties, and match least closely with supporters of the right-wing party AfD. Additionally, including or excluding specific variables in the prompts can significantly change the models' predictions. These findings underscore the importance of aligning LLMs to model diverse public opinions more effectively while minimizing political bias and improving the robustness of representation.
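The persona-prompting setup described above can be sketched as follows. This is a minimal illustration only: the demographic field names, the prompt template wording, and the example question are assumptions for demonstration, not the study's exact implementation.

```python
# Illustrative sketch of demographic persona prompting.
# Field names and template wording are assumptions, not the study's exact setup.

def build_persona_prompt(demographics: dict, question: str) -> str:
    """Compose a persona description from demographic attributes and
    append an open-ended survey question for the LLM to answer."""
    persona = ", ".join(f"{key}: {value}" for key, value in demographics.items())
    return (
        f"You are a survey respondent in Germany with the following profile: {persona}. "
        f"Answer the question below in your own words.\n\n"
        f"Question: {question}"
    )

# Hypothetical respondent profile and question:
prompt = build_persona_prompt(
    {"age": 34, "gender": "female", "state": "Bavaria", "party preference": "The Greens"},
    "What is currently the most important political problem in Germany?",
)
print(prompt)
```

The resulting prompt would then be sent to each model under comparison; varying which demographic keys are included corresponds to the variable-inclusion effect noted above.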