When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings

Source: arXiv. Code is available at https://github.com/jeremyperez2/TelephoneGameLLM. A companion website with a Data Explorer tool is available at https://sites.google.com/view/telephone-game-llm. This paper was published at the 2025 International Conference on Learning Representations (ICLR 2025): https://iclr.cc/virtual/2025/poster/28880

As large language models (LLMs) start interacting with each other and generating an increasing amount of text online, it becomes crucial to better understand how information is transformed as it passes from one LLM to the next. While significant research has examined individual LLM behaviors, existing studies have largely overlooked the collective behaviors and information distortions arising from iterated LLM interactions. Small biases, negligible at the single output level, risk being amplified in iterated interactions, potentially leading the content to evolve towards attractor states. In a series of telephone game experiments, we apply a transmission chain design borrowed from the human cultural evolution literature: LLM agents iteratively receive, produce, and transmit texts from the previous to the next agent in the chain. By tracking the evolution of text toxicity, positivity, difficulty, and length across transmission chains, we uncover the existence of biases and attractors, and study their dependence on the initial text, the instructions, language model, and model size. For instance, we find that more open-ended instructions lead to stronger attraction effects compared to more constrained tasks. We also find that different text properties display different sensitivity to attraction effects, with toxicity leading to stronger attractors than length. These findings highlight the importance of accounting for multi-step transmission dynamics and represent a first step towards a more comprehensive understanding of LLM cultural dynamics.
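The transmission chain design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repository linked above). The names `run_chain`, `toy_agent`, `track_property`, and `word_count` are hypothetical, and `toy_agent` is a deterministic stand-in for an LLM call, built with a small, deliberate bias toward a five-word output so that the drift toward an attractor is visible without any model access.

```python
from typing import Callable, List


def run_chain(initial_text: str, agent: Callable[[str], str], n_generations: int) -> List[str]:
    """Pass a text down a chain: each agent receives the previous agent's output."""
    texts = [initial_text]
    for _ in range(n_generations):
        texts.append(agent(texts[-1]))
    return texts


def toy_agent(text: str, target_words: int = 5) -> str:
    """Hypothetical stand-in for an LLM (assumption, not a real model).

    It nudges the word count one step toward `target_words` per generation,
    mimicking a small per-step bias that accumulates over the chain.
    """
    words = text.split()
    if len(words) > target_words:
        return " ".join(words[:-1])        # drop one word
    if len(words) < target_words:
        return " ".join(words + ["more"])  # add one filler word
    return text                            # already at the attractor


def track_property(texts: List[str], measure: Callable[[str], float]) -> List[float]:
    """Evaluate one text property at every generation of a chain."""
    return [measure(t) for t in texts]


def word_count(text: str) -> int:
    """Length measured in whitespace-separated words."""
    return len(text.split())


# Two chains with different initial texts converge to the same length.
long_chain = run_chain("one two three four five six seven eight", toy_agent, 6)
short_chain = run_chain("one two", toy_agent, 6)
long_lengths = track_property(long_chain, word_count)
short_lengths = track_property(short_chain, word_count)
print(long_lengths)
print(short_lengths)
```

In this toy setting both chains settle at the same length regardless of where they start, which is the signature of an attractor. In an actual experiment, `toy_agent` would be replaced by a model call with a task instruction, and `word_count` by a measure of toxicity, positivity, difficulty, or length.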
