How collective behaviors emerge from the interactions of individual LLM-driven agents is a central question in artificial life, yet controlled study of these emergent dynamics has been hindered by the lack of a principled simulation framework for systematic experimentation. To address this, we introduce Shachi, a methodology and modular framework that decomposes an agent's cognition into three core components: Configuration for intrinsic identity, Memory for contextual continuity, and Tools for extended capabilities, all orchestrated by an LLM reasoning engine. This decomposition treats each cognitive component as an independently controllable variable, enabling perturbation studies that trace how micro-level cognitive traits propagate into population-level dynamics. We investigate behavioral patterns across a 10-task benchmark spanning three levels of collective complexity. Shachi enables memory transfer across environment transitions, which produces history-dependent behavioral shifts, and allows agents to inhabit multiple environments simultaneously, which reveals cross-environment interference that is invisible in single-environment studies. Furthermore, in a case study of a real-world U.S. tariff shock, locally interacting agents with individually controlled cognitive components produce macro-level market dynamics that are directionally consistent with observed outcomes. Our work provides a rigorous, open-source simulation framework for LLM-based agent-based modeling (ABM), aimed at fostering cumulative scientific inquiry into the emergent collective behaviors of interacting artificial agents.
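To make the decomposition concrete, the following is a minimal sketch of an agent whose Configuration, Memory, and Tools are separate, swappable parts coordinated by an LLM call. It is not the Shachi API: every class, function, and field name here (`Configuration`, `Memory`, `Agent`, `make_agent`, `stub_llm`) is a hypothetical stand-in, and the LLM is replaced by a deterministic stub so that the example runs without a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# All names below are hypothetical illustrations, not the Shachi API.

@dataclass
class Configuration:
    """Intrinsic identity, reduced here to a persona string."""
    persona: str

@dataclass
class Memory:
    """Contextual continuity: an ordered log of past interactions."""
    events: List[str] = field(default_factory=list)

    def add(self, event: str) -> None:
        self.events.append(event)

    def recall(self, k: int = 3) -> List[str]:
        """Return the k most recent events."""
        return self.events[-k:]

Tool = Callable[[str], str]  # extended capability: text in, text out
LLM = Callable[[str], str]   # reasoning engine: prompt in, action out

@dataclass
class Agent:
    config: Configuration
    memory: Memory
    tools: Dict[str, Tool]
    llm: LLM

    def build_prompt(self, observation: str) -> str:
        """Assemble the prompt from the three components plus the observation."""
        lines = [f"Persona: {self.config.persona}"]
        lines += [f"Memory: {event}" for event in self.memory.recall()]
        lines += [f"Tool: {name}" for name in sorted(self.tools)]
        lines.append(f"Observation: {observation}")
        return "\n".join(lines)

    def act(self, observation: str) -> str:
        """Query the reasoning engine, then record the step in memory."""
        action = self.llm(self.build_prompt(observation))
        self.memory.add(f"{observation} -> {action}")
        return action

def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM call (assumption for the demo)."""
    return "sell" if "risk-averse" in prompt else "hold"

def make_agent(persona: str, memory: Optional[Memory] = None) -> Agent:
    """Build an agent, optionally reusing an existing Memory object."""
    mem = memory if memory is not None else Memory()
    return Agent(Configuration(persona), mem, {}, stub_llm)

# Perturbation: two agents that differ only in Configuration.
baseline = make_agent("neutral trader")
perturbed = make_agent("risk-averse trader")
print(baseline.act("tariff announced"), perturbed.act("tariff announced"))

# Memory transfer: a new agent starts from the baseline agent's history.
carried = make_agent("neutral trader", memory=baseline.memory)
print(carried.build_prompt("market reopens"))
```

Because each component is passed in separately, a perturbation study in this sketch amounts to changing one constructor argument while holding the others fixed, and memory transfer amounts to handing an existing `Memory` object to an agent created for a different environment.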