Human groups are able to converge on more accurate beliefs through deliberation, even in the presence of polarization and partisan bias, a phenomenon known as the "wisdom of partisan crowds." Generative agents powered by Large Language Models (LLMs) are increasingly used to simulate human collective behavior, yet few benchmarks exist for evaluating their dynamics against the behavior of human groups. In this paper, we examine the extent to which the wisdom of partisan crowds emerges in groups of LLM-based agents that are prompted to role-play as partisan personas (e.g., Democrat or Republican). We find that these agents not only display human-like partisan biases but also converge to more accurate beliefs through deliberation, as humans do. We then identify several factors that interfere with convergence, including the use of chain-of-thought prompting and a lack of detail in personas. Conversely, fine-tuning on human data appears to enhance convergence. These findings highlight both the potential and the limitations of LLM-based agents as a model of human collective intelligence.