Alignment is a social phenomenon in which individuals share a common goal or perspective. Mirroring, the mimicking of another individual's behaviors and opinions, is one mechanism by which individuals can become aligned. Large-scale investigations of the effect of mirroring on alignment have been limited by the poor scalability of traditional experimental designs in sociology. In this paper, we introduce a simple computational framework for studying the effect of mirroring behavior on alignment in multi-agent systems. We simulate systems of interacting large language models in this framework and characterize overall system behavior and alignment with quantitative measures of agent dynamics. We find that system behavior is strongly influenced by each agent's range of communication and that these effects are amplified by increased rates of mirroring. We discuss the observed simulated system behavior in the context of known human social dynamics.
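To make the two quantities named in the abstract concrete (communication range and mirroring rate), the following is a minimal toy sketch and not the paper's framework. It replaces the interacting large language models with agents that each hold a single discrete opinion label, places them on a ring, and lets each agent copy a randomly chosen neighbor with some probability. The names `simulate`, `alignment`, `comm_range`, `mirror_rate`, and `n_opinions`, the ring topology, and the majority-fraction alignment measure are all assumptions introduced here for illustration.

```python
import random
from collections import Counter


def alignment(opinions):
    """Fraction of agents holding the most common opinion (1.0 means full consensus).

    This measure is an illustrative assumption, not the paper's metric.
    """
    return Counter(opinions).most_common(1)[0][1] / len(opinions)


def simulate(n_agents=30, comm_range=2, mirror_rate=0.5,
             n_steps=200, n_opinions=5, seed=0):
    """Toy mirroring dynamics on a ring; returns the alignment value at each step.

    Hypothetical stand-in for LLM agents: each agent is just one opinion label.
    """
    if comm_range < 1:
        raise ValueError("comm_range must be at least 1")
    rng = random.Random(seed)
    opinions = [rng.randrange(n_opinions) for _ in range(n_agents)]
    # Offsets of the neighbors an agent can "hear", excluding itself.
    offsets = [d for d in range(-comm_range, comm_range + 1) if d != 0]
    history = [alignment(opinions)]
    for _ in range(n_steps):
        # Synchronous update: everyone reads the previous step's opinions.
        new_opinions = list(opinions)
        for i in range(n_agents):
            if rng.random() < mirror_rate:
                # Mirror: copy the opinion of a random neighbor within range.
                j = (i + rng.choice(offsets)) % n_agents
                new_opinions[i] = opinions[j]
        opinions = new_opinions
        history.append(alignment(opinions))
    return history


if __name__ == "__main__":
    # Compare a narrow and a wide communication range at the same mirroring rate.
    for r in (1, 10):
        final = simulate(comm_range=r, mirror_rate=0.5)[-1]
        print(f"comm_range={r}: final alignment = {final:.2f}")
```

Sweeping `comm_range` and `mirror_rate` in such a toy model gives a rough feel for the kind of parameter study the abstract describes, though any resemblance to the reported LLM results should not be assumed.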