Policymakers must often act under conditions of deep uncertainty, such as emergency response, where predicting the specific impacts of a policy a priori is infeasible. Large Language Model (LLM) agent simulations have been proposed as tools to support policymakers under these conditions, yet little is known about how such simulations become useful for real-world policy practice. To address this gap, we conducted a year-long, stakeholder-engaged design process with a university emergency preparedness team. Through iterative design cycles, we developed and refined an LLM agent simulation of a large-scale campus gathering, ultimately scaling to 13,000 agents that modeled crowd movement and communication under various emergency scenarios. Rather than producing predictive forecasts, these simulations supported policy practice by shaping volunteer training, evacuation procedures, and infrastructure planning. From these findings, we identify three design process implications for making LLM agent simulations useful for policy practice: start from verifiable scenarios to bootstrap trust, use preliminary simulations to elicit tacit domain knowledge, and treat simulation capabilities and policy implementation as co-evolving.