When Identity Overrides Incentives: Representational Choices as Governance Decisions in Multi-Agent LLM Systems

Multi-agent systems built on large language models are increasingly deployed in strategic policy and governance settings, where agents representing stakeholders with conflicting interests must coordinate under shared constraints. These systems typically assign role-based personas to agents, describing their motivations and objectives. Whether agents with role-based identities follow explicit payoffs or their assigned roles in strategic decision-making remains untested. Here we show that assigning role-based personas suppresses payoff-aligned behavior in four-agent strategic games, shifting equilibrium attainment by up to 90 percentage points even when agents have complete payoff information. We test a 2×2 factorial design (persona presence × payoff visibility) across four models (Qwen-7B, Qwen-32B, Llama-8B, Mistral-7B) and 53 environmental policy scenarios with two equilibria: Tragedy of the Commons, where individual payoff dominates, and Green Transition, where collective payoff dominates. With personas present, all models reach the Tragedy equilibrium at near-zero rates in the Tragedy-dominant scenarios despite complete payoff information, and 100% of equilibria correspond to Green Transition. Removing personas alone does not bring any model to the Tragedy equilibrium; only the Qwen models reach Tragedy equilibrium rates of 65-90%, and only when personas are removed and payoffs are made explicit. Three distinct behavioral profiles emerge: Qwen shifts equilibrium selection based on framing condition, Mistral increases response variance without reaching the Tragedy equilibrium, and Llama remains nearly constant across all conditions. Representational choices in multi-agent LLM systems are governance decisions: persona assignment determines which equilibrium a simulation produces, independent of the underlying incentive structure.
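To make the notion of a payoff-determined equilibrium concrete, the following is a minimal sketch of a four-agent, two-action game and a brute-force check for pure-strategy Nash equilibria. The payoff function, its parameters (`shared_gain`, `private_bonus`), and the action names are hypothetical illustrations, not the payoffs used in the 53 scenarios; the sketch only shows how a scenario's incentive structure can make Tragedy or Green Transition the equilibrium, and how the four cells of the 2×2 design can be enumerated.

```python
from itertools import product

ACTIONS = ("exploit", "conserve")  # hypothetical action labels
N_AGENTS = 4

def payoff(profile, agent, shared_gain=2.0, private_bonus=3.0):
    """Hypothetical payoff: every agent earns shared_gain per conserver,
    and an agent that exploits also earns private_bonus."""
    conservers = sum(action == "conserve" for action in profile)
    own_bonus = private_bonus if profile[agent] == "exploit" else 0.0
    return shared_gain * conservers + own_bonus

def is_nash(profile, **params):
    """True if no single agent can gain by unilaterally changing its action."""
    for agent in range(N_AGENTS):
        current = payoff(profile, agent, **params)
        for alternative in ACTIONS:
            deviated = profile[:agent] + (alternative,) + profile[agent + 1:]
            if payoff(deviated, agent, **params) > current:
                return False
    return True

def pure_nash_equilibria(**params):
    """Enumerate all 2^4 action profiles and keep the Nash equilibria."""
    return [p for p in product(ACTIONS, repeat=N_AGENTS) if is_nash(p, **params)]

# The four cells of the 2x2 design: (persona_present, payoffs_visible).
CONDITIONS = list(product((True, False), (True, False)))

# Individual payoff dominates (private_bonus > shared_gain): all agents exploit.
tragedy_dominant = pure_nash_equilibria(shared_gain=2.0, private_bonus=3.0)
# Collective payoff dominates (shared_gain > private_bonus): all agents conserve.
green_dominant = pure_nash_equilibria(shared_gain=4.0, private_bonus=3.0)
```

Under these assumed payoffs, a payoff-following agent would exploit in the first scenario and conserve in the second. The finding reported above is that agents with personas select the Green Transition outcome even in Tragedy-dominant scenarios.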
