Computational models have become a powerful tool in the quantitative sciences for understanding the behaviour of complex systems that evolve in time. However, they often contain a potentially large number of free parameters whose values cannot be obtained from theory but need to be inferred from data. This is especially the case for models in the social sciences, economics, or computational epidemiology. Yet many current parameter estimation methods are mathematically involved and computationally slow to run. In this paper we present a computationally simple and fast method to retrieve accurate probability densities for model parameters using neural differential equations. We present a pipeline comprising multi-agent models acting as forward solvers for systems of ordinary or stochastic differential equations, and a neural network that then extracts parameters from the data generated by the model. The two combined create a powerful tool that can quickly estimate densities for model parameters, even for very large systems. We demonstrate the method on synthetic time series data of the SIR model of the spread of infection, and perform an in-depth analysis of the Harris-Wilson model of economic activity on a network, representing a non-convex problem. For the latter, we apply our method both to synthetic data and to data on economic activity across Greater London. We find that our method calibrates the model orders of magnitude more accurately than a previous study of the same dataset using classical techniques, while running between 195 and 390 times faster.
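To make the calibration setting concrete, the following is a minimal sketch of the forward-solver half of such a pipeline for the SIR model, together with the data-fit loss that a calibration method would minimise. It is not the neural method of the paper: the neural network is replaced here by a plain grid search purely for illustration, and the function names (`simulate_sir`, `squared_error`, `grid_search`), the forward Euler scheme, the initial conditions, and all parameter values are assumptions rather than details taken from the paper.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, r0=0.0, dt=0.1, steps=200):
    """Integrate the deterministic SIR ODEs with forward Euler.

    Returns a list of (S, I, R) population fractions, one per time step.
    """
    s, i, r = s0, i0, r0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # S -> I flow
        new_recoveries = gamma * i * dt      # I -> R flow
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


def squared_error(simulated, observed):
    """Sum of squared differences over all time steps and compartments."""
    return sum(
        (a - b) ** 2
        for sim_state, obs_state in zip(simulated, observed)
        for a, b in zip(sim_state, obs_state)
    )


def grid_search(observed, betas, gammas):
    """Return (loss, beta, gamma) for the best-fitting pair on the grid.

    A stand-in for the parameter-extraction step; the paper uses a
    neural network here instead.
    """
    best = None
    for beta in betas:
        for gamma in gammas:
            loss = squared_error(simulate_sir(beta, gamma), observed)
            if best is None or loss < best[0]:
                best = (loss, beta, gamma)
    return best


# Synthetic "observations" generated with known (illustrative) parameters.
observed = simulate_sir(beta=0.3, gamma=0.1)
loss, best_beta, best_gamma = grid_search(
    observed, betas=[0.1, 0.2, 0.3, 0.4], gammas=[0.05, 0.1, 0.15]
)
print(best_beta, best_gamma, loss)
```

A grid search returns only a single best-fitting point and scales poorly with the number of parameters, which is one reason a method that estimates densities over parameters, as described in the abstract, is attractive for larger models.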