Large language models (LLMs) can produce high-quality information at unprecedented rates. As these models become further entrenched in society, the content they produce will become increasingly pervasive in databases that are, in turn, incorporated into the pre-training data, fine-tuning data, retrieval data, and other inputs of other language models. In this paper, we formalize the idea of a communication network of LLMs and introduce a method for representing the perspective of individual models within a collection of LLMs. Using these tools, we systematically study information diffusion in the communication network of LLMs in various simulated settings.