Large language model (LLM)-based multi-agent systems enable expressive agent reasoning but are expensive to scale and poorly calibrated for timestep-aligned state-transition simulation, while classical agent-based models (ABMs) offer interpretability but struggle to integrate rich individual-level signals and non-stationary behaviors. We propose PhysicsAgentABM, which shifts inference to behaviorally coherent agent clusters: state-specialized symbolic agents encode mechanistic transition priors, a multimodal neural transition model captures temporal and interaction dynamics, and uncertainty-aware epistemic fusion yields calibrated cluster-level transition distributions. Individual agents then stochastically realize transitions under local constraints, decoupling population-level inference from entity-level variability. We further introduce ANCHOR, an LLM-agent-driven clustering strategy based on cross-contextual behavioral responses and a novel contrastive loss, which reduces LLM calls by a factor of 6-8. Experiments across public health, finance, and social science domains show consistent gains in event-time accuracy and calibration over mechanistic, neural, and LLM baselines. By re-architecting generative ABM around population-level inference with uncertainty-aware neuro-symbolic fusion, PhysicsAgentABM establishes a new paradigm for scalable and calibrated simulation with LLMs.