The Epi-LLM Framework: probing LLM behavioral priors through epidemiological agent-based models

Human behaviour during epidemics affects infectious disease dynamics, but quantifying this remains deeply challenging. Here we introduce the Epi-LLM framework: a novel integration of agent-based modelling, real-life epigames, and large language models (LLMs) in which a synthetic society of agents reasons and adapts dynamically over an outbreak contact network. Comparing synthetic agent behaviour against a no-intervention SEIR baseline and human participant data from the AUIB epigame study, we find that LLM agents across four different architectures reduced peak active infections, with quarantine compliance peaking at 58-65% on day six of the 15-day simulation. A binomial generalised linear model showed that perceived health severity was the strongest predictor of quarantine behaviour ($\beta = 0.33$, $p = 0.002$), yielding a pseudo-$R^2$ of 0.055, comparable to the 0.072 observed in the human trial. LLM architecture is a key determinant of epidemic dynamics: low-variance architectures offer greater internal validity for testing behavioural rules, while high-variance models may better represent real-world decision-making. Geographic labels alone do not induce culturally differentiated behaviour; explicit attitudinal parameterisation is required. This proof-of-principle work lays the groundwork for deploying the Epi-LLM framework as a scalable, risk-free simulation environment for pandemic preparedness research.
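To make the comparison against a no-intervention SEIR baseline concrete, the following is a minimal sketch of a discrete-time SEIR model over a 15-day horizon, with an optional quarantine-compliance curve that scales down transmission. It is not the authors' implementation: the abstract describes agents on a contact network, whereas this sketch is a well-mixed compartmental model. The function names, the population size, the rates `beta`, `sigma` and `gamma`, and the triangular shape of the compliance curve are all assumptions chosen for illustration; only the 15-day horizon and a compliance peak of roughly 60% on day six come from the abstract.

```python
def peaked_compliance(day, peak_day=6, peak=0.6):
    """Hypothetical compliance curve: rises linearly to `peak` on `peak_day`,
    then falls off linearly (never below zero)."""
    return max(0.0, peak * (1.0 - abs(day - peak_day) / peak_day))


def simulate_seir(days=15, population=1000, initial_infected=5,
                  beta=0.6, sigma=1 / 3, gamma=1 / 7, compliance=None):
    """Deterministic SEIR with one-day steps.

    beta, sigma and gamma are illustrative values, not taken from the paper.
    `compliance(day)` returns the fraction of the population quarantining on
    that day; transmission is scaled by (1 - compliance).
    Returns a list of (S, E, I, R) tuples, one per day including day 0.
    """
    s = float(population - initial_infected)
    e, i, r = 0.0, float(initial_infected), 0.0
    history = [(s, e, i, r)]
    for day in range(days):
        q = compliance(day) if compliance is not None else 0.0
        # New exposures, capped so S cannot go negative.
        new_exposed = min(beta * (1.0 - q) * s * i / population, s)
        new_infectious = sigma * e
        new_recovered = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


def peak_active_infections(history):
    """Largest value of the I compartment over the simulated horizon."""
    return max(state[2] for state in history)


if __name__ == "__main__":
    baseline = simulate_seir()
    with_quarantine = simulate_seir(compliance=peaked_compliance)
    print("peak I, no intervention:", round(peak_active_infections(baseline), 1))
    print("peak I, with quarantine:", round(peak_active_infections(with_quarantine), 1))
```

In a setup like the one the abstract describes, the compliance curve would presumably come from the LLM agents' daily quarantine decisions rather than from a fixed function, and the resulting peak would then be compared against the baseline run.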

