A growing body of multi-agent studies with LLMs explores how norms and cooperation emerge in mixed-motive scenarios, where pursuing individual gain can undermine the collective good. While prior work has explored these dynamics in both richly contextualized simulations and simplified game-theoretic environments, most LLM systems featuring common-pool resource (CPR) games provide agents with explicit reward functions directly tied to their actions. In contrast, human cooperation often emerges without explicit knowledge of the payoff structure or of how individual actions translate into long-run outcomes, relying instead on heuristics, communication, and enforcement. We introduce a CPR simulation framework that removes explicit reward signals and embeds cultural-evolutionary mechanisms: social learning (adopting strategies and beliefs from successful peers) and norm-based punishment, grounded in Ostrom's principles of resource governance. Agents also learn individually from the consequences of harvesting, monitoring, and punishing via environmental feedback, enabling norms to emerge endogenously. We establish the validity of our simulation by reproducing key findings from existing studies on human behavior. Building on this, we examine norm evolution across a $2\times2$ grid of environmental and social initializations (resource-rich vs. resource-scarce; altruistic vs. selfish) and benchmark how agentic societies composed of different LLMs perform under these conditions. Our results reveal systematic model differences in sustaining cooperation and norm formation, positioning the framework as a rigorous testbed for studying emergent norms in mixed-motive LLM societies. Such analysis can inform the design of AI systems deployed in social and organizational contexts, where alignment with cooperative norms is critical for stability, fairness, and effective governance of AI-mediated environments.