Comparing Data Assimilation and Likelihood-Based Inference on Latent State Estimation in Agent-Based Models

In this paper, we present the first systematic comparison of Data Assimilation (DA) and Likelihood-Based Inference (LBI) for latent state estimation in Agent-Based Models (ABMs). ABMs generate observable time series driven by evolving, partially latent microstates. These latent states must be estimated to align simulations with real-world data, a task traditionally addressed by DA, particularly in the continuous, equation-based models used in weather forecasting. However, the nature of ABMs poses challenges for standard DA methods. Addressing these challenges requires either adapting existing DA techniques or using ad hoc alternatives such as LBI. DA approximates the likelihood in a model-agnostic way, which makes it broadly applicable but potentially less precise. In contrast, LBI provides more accurate state estimation by directly leveraging the model's likelihood, at the cost of requiring a hand-crafted, model-specific likelihood function that may be complex or infeasible to derive. We compare the two methods on the Bounded-Confidence Model, a well-known opinion dynamics ABM in which agents are affected only by others holding sufficiently similar opinions. We find that LBI better recovers latent agent-level opinions, even under model mis-specification, leading to improved individual-level forecasts. At the aggregate level, however, both methods perform comparably, and DA remains competitive across levels of aggregation under certain parameter settings. Our findings suggest that DA is well-suited for aggregate predictions, while LBI is preferable for agent-level inference.
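
To make the bounded-confidence mechanism concrete, the following is a minimal sketch of one common formulation, a Deffuant-style pairwise update. The abstract does not state which variant is used, so the update rule, the function names `bcm_step` and `simulate`, and the parameters `epsilon` (confidence bound) and `mu` (convergence rate) are assumptions made for illustration only. Two randomly chosen agents move toward each other only when their opinions differ by less than `epsilon`; the opinion vector plays the role of the latent microstate that DA or LBI would need to estimate.

```python
import random


def bcm_step(opinions, epsilon, mu, rng):
    """Apply one pairwise interaction in place and return the chosen pair.

    Assumed rule (Deffuant-style): agents i and j interact only if their
    opinions differ by less than epsilon, and then each moves a fraction
    mu of the gap toward the other.
    """
    i, j = rng.sample(range(len(opinions)), 2)  # two distinct agents
    diff = opinions[j] - opinions[i]
    if abs(diff) < epsilon:  # bounded confidence: ignore distant opinions
        opinions[i] += mu * diff
        opinions[j] -= mu * diff
    return i, j


def simulate(n_agents=50, n_steps=2000, epsilon=0.2, mu=0.5, seed=0):
    """Run the sketch from uniformly random initial opinions in [0, 1]."""
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]
    for _ in range(n_steps):
        bcm_step(opinions, epsilon, mu, rng)
    return opinions
```

In this sketch the symmetric update leaves the population mean unchanged, while individual opinions drift into clusters. This is consistent with the intuition that aggregate quantities can be easier to track than agent-level states, although the sketch itself is not evidence for the paper's results.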
