EASE Configuration Facilitates A Reproducible Science of LLM Social Simulations

LLMs are increasingly deployed to simulate social interactions, yet many existing simulators remain ad hoc and monolithic. This lack of architectural standardization hinders reproducible research and complicates downstream evaluation. We advance a rigorous science of LLM-based multi-agent simulation by modularizing core components into Environments, Agents, Simulation engines, and Evaluation metrics (EASE). We demonstrate the utility of EASE configuration by wrapping it in an experimental study schema that orchestrates workflows centered on answering explicit research questions in generated scenarios. We contribute SiliSocS, an open-source, research-ready Silicon Society Sandbox that implements a study-structured EASE configuration to enable highly configurable and reproducible LLM-based social simulations. Using SiliSocS and EASE, we present three case studies that respectively showcase comprehensive assessment of existing questions, deeper investigation of complex questions, and elaboration of existing studies. Together, these case studies highlight the limitations of current modeling approaches and isolate the impacts of design choices on key results.
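To make the modular idea concrete, the following is a minimal sketch of how a study-structured EASE configuration might be represented. It is not the SiliSocS API: every name here (`Component`, `EASEConfig`, `Study`, `fingerprint`, and the component labels such as `town_square` or `turn_based`) is a hypothetical placeholder chosen for illustration. The sketch assumes only what the abstract states, namely that the four component types are declared separately and that a study ties them to an explicit research question.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field, replace


@dataclass(frozen=True)
class Component:
    """One swappable module (hypothetical; not a SiliSocS class)."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass(frozen=True)
class EASEConfig:
    """The four EASE parts, declared independently of one another."""
    environment: Component
    agents: tuple
    engine: Component
    metrics: tuple
    seed: int = 0


@dataclass(frozen=True)
class Study:
    """Wraps a configuration around an explicit research question."""
    research_question: str
    config: EASEConfig


def fingerprint(study: Study) -> str:
    """Return a short, deterministic hash of the full study definition."""
    payload = json.dumps(asdict(study), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Illustrative values only; none of these component names come from the paper.
base = Study(
    research_question="Does the simulation engine change the outcome?",
    config=EASEConfig(
        environment=Component("town_square"),
        agents=(Component("persona_agent", {"count": 10}),),
        engine=Component("turn_based"),
        metrics=(Component("opinion_variance"),),
        seed=42,
    ),
)

# Swap a single component while holding everything else fixed.
variant = replace(base, config=replace(base.config, engine=Component("event_driven")))

print(fingerprint(base), fingerprint(variant))
```

Because each component is declared separately, changing one of them (here, the engine) leaves the rest of the study untouched, which is the sense in which a modular configuration can help isolate the impact of a single design choice. The fingerprint is one simple way to check that two runs used an identical study definition.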

