Can classical consensus models predict the group behavior of large language models (LLMs)? We examine multi-round interactions among LLM agents through the DeGroot framework, in which agents exchange text-based messages over diverse communication graphs. To track opinion evolution, we map each message to an opinion score via sentiment analysis. We find that agents typically reach consensus and that the disagreement among them decays exponentially. However, the limiting opinion departs from DeGroot's network-centrality-weighted forecast: the consensus among LLM agents is largely insensitive to initial conditions and instead depends strongly on the discussion subject and inherent biases. Nevertheless, the transient dynamics align with classical graph theory, and the convergence rate of opinions is closely related to the second-largest eigenvalue of the graph's combination matrix. Together, these findings can inform LLM-driven social-network simulations and the design of resource-efficient multi-agent LLM applications.
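For reference, the classical DeGroot baseline mentioned above can be sketched numerically. This is a minimal illustration, not the experimental setup of this work. The 4-agent path graph, the uniform neighbor weights, the initial opinions, and all function names are assumptions chosen for the example. It iterates x(t+1) = A x(t) for a row-stochastic combination matrix A, computes the centrality-weighted consensus forecast from the left Perron eigenvector of A, and compares the observed decay of disagreement with the second-largest eigenvalue modulus.

```python
import numpy as np

def degroot_simulate(A, x0, rounds):
    """Iterate x(t+1) = A x(t); return all opinion vectors, shape (rounds + 1, n)."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(rounds):
        x = A @ x
        trajectory.append(x.copy())
    return np.array(trajectory)

def degroot_consensus(A, x0):
    """Classical forecast: pi^T x0, with pi the left eigenvector of A for eigenvalue 1."""
    vals, vecs = np.linalg.eig(A.T)
    k = np.argmin(np.abs(vals - 1.0))      # pick the eigenvalue closest to 1
    pi = np.real(vecs[:, k])
    pi = pi / pi.sum()                     # normalize so the weights sum to 1
    return float(pi @ np.asarray(x0, dtype=float)), pi

def second_largest_modulus(A):
    """Second-largest eigenvalue modulus of A, which sets the asymptotic decay rate."""
    moduli = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
    return float(moduli[1])

# Hypothetical example: path graph on 4 agents with self-loops,
# each agent averaging uniformly over itself and its neighbors.
adjacency = np.array([[1, 1, 0, 0],
                      [1, 1, 1, 0],
                      [0, 1, 1, 1],
                      [0, 0, 1, 1]], dtype=float)
A = adjacency / adjacency.sum(axis=1, keepdims=True)   # row-stochastic matrix
x0 = np.array([1.0, 0.2, -0.4, -0.8])                  # assumed initial opinion scores

trajectory = degroot_simulate(A, x0, rounds=60)
limit, pi = degroot_consensus(A, x0)
rho = second_largest_modulus(A)

# Disagreement per round, measured here as the spread between extreme opinions.
disagreement = trajectory.max(axis=1) - trajectory.min(axis=1)
empirical_rate = disagreement[30] / disagreement[29]

print("centrality weights:", pi)
print("forecast consensus:", limit)
print("final opinions:", trajectory[-1])
print("second-largest eigenvalue modulus:", rho)
print("empirical per-round decay factor:", empirical_rate)
```

In this linear model the limit is fixed entirely by the initial opinions and the centrality weights, which is the dependence that the LLM agents are reported not to follow, while the per-round decay factor approaching the second-largest eigenvalue modulus corresponds to the transient behavior that does carry over.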