This paper introduces CoBRA, a novel toolkit for systematically specifying agent behavior in LLM-based social simulation. We found that conventional approaches, which specify agent behavior through implicit natural-language descriptions, often fail to yield consistent behavior across models, and the resulting behavior does not capture the nuances of those descriptions. In contrast, CoBRA introduces a model-agnostic way to control agent behavior that lets researchers explicitly specify the desired nuances and obtain consistent behavior across models. At the heart of CoBRA is a novel closed-loop system primitive with two components: (1) a Cognitive Bias Index that measures the cognitive bias a social agent demonstrates, by quantifying the agent's reactions in a set of validated classic social science experiments; and (2) a Behavioral Regulation Engine that aligns the agent's behavior to exhibit a controlled degree of cognitive bias. Through CoBRA, we show how to operationalize validated social science knowledge (i.e., classic experiments) as reusable "gym" environments for AI -- an approach that may generalize to richer social and affective simulations beyond bias alone.
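The closed-loop primitive described above can be sketched as follows. This is a minimal illustrative sketch, not CoBRA's actual API: the function names, the dictionary-based agent, the per-experiment probe functions, and the simple proportional correction are all assumptions made for exposition.

```python
# Hypothetical sketch of CoBRA's closed-loop primitive. All names and the
# proportional adjustment rule are illustrative assumptions, not the
# toolkit's real interface.

def cognitive_bias_index(agent, experiments):
    """Cognitive Bias Index: average the agent's measured reaction
    across a set of validated classic experiments (here, stub probes)."""
    return sum(probe(agent) for probe in experiments) / len(experiments)

def regulate(agent, experiments, target, steps=10, tol=0.05):
    """Behavioral Regulation Engine: repeatedly measure the agent's
    demonstrated bias and nudge its configuration until the index is
    within `tol` of the researcher-specified target level."""
    for _ in range(steps):
        index = cognitive_bias_index(agent, experiments)
        error = target - index
        if abs(error) <= tol:
            break
        agent["bias_strength"] += 0.5 * error  # proportional correction
    return agent

# Toy agent: its 'bias_strength' knob directly drives each probe's reading.
agent = {"bias_strength": 0.0}
experiments = [lambda a: a["bias_strength"] + 0.1,
               lambda a: a["bias_strength"] - 0.1]
regulated = regulate(agent, experiments, target=0.8)
```

In this toy setting the loop converges to a measured index within `tol` of the target, illustrating how measurement (the index) and control (the engine) close the loop; in a real deployment the probes would be LLM-driven experiment protocols and the adjustment would modify the agent's prompt or persona rather than a numeric knob.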