Large Language Model (LLM)-based multi-agent systems are increasingly used to simulate human interactions and solve collaborative tasks. A common practice is to assign personas to agents to encourage behavioral diversity. However, this raises a critical yet underexplored question: do personas introduce biases into multi-agent interactions? This paper presents a systematic investigation into persona-induced biases in multi-agent interactions, focusing on social traits such as trustworthiness (how an agent's opinion is received by others) and insistence (how strongly an agent advocates for its opinion). Through a series of controlled experiments on collaborative problem-solving and persuasion tasks, we reveal that (1) LLM-based agents exhibit biases in both trustworthiness and insistence, with personas from historically advantaged groups (e.g., men and White individuals) being perceived as less trustworthy and exhibiting less insistence; and (2) agents exhibit significant in-group favoritism, showing a higher tendency to conform to others who share the same persona. These biases persist across various LLMs, group sizes, and numbers of interaction rounds, highlighting an urgent need for awareness and mitigation to ensure the fairness and reliability of multi-agent systems.