We show that maximum likelihood estimation of the parameters of agent-based models (ABMs) of opinion dynamics outperforms the typical simulation-based approach. Simulation-based approaches simulate the model repeatedly in search of a set of parameters that generates data sufficiently similar to the observed data. In contrast, likelihood-based approaches derive a likelihood function that connects the unknown parameters to the observed data in a statistically principled way. We compare the two approaches on the well-known bounded-confidence model of opinion dynamics, in three realistic scenarios of increasing complexity that differ in data availability: (i) fully observed opinions and interactions, (ii) partially observed interactions, and (iii) observed interactions with noisy proxies of the opinions. We highlight how identifying observed and latent variables is fundamental for connecting the model to the data. To realize the likelihood-based approach, we first cast the model into a probabilistic generative form that supports a proper data likelihood. Then, we describe the three scenarios via probabilistic graphical models and show the nuances involved in translating the model. Finally, we implement the resulting probabilistic models in an automatic differentiation framework (PyTorch), which enables easy and efficient maximum likelihood estimation via gradient descent. Our experimental results show that the maximum likelihood estimates are up to 4x more accurate and require up to 200x less computational time.
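To make the likelihood-based idea concrete, the following is a minimal sketch of scenario (i), in which both opinions and interaction outcomes are observed. It is an illustration under stated assumptions, not the implementation evaluated in this work. We assume that the hard bounded-confidence rule is relaxed into a Bernoulli interaction with probability sigmoid(rho * (epsilon - |x_u - x_v|)), where the steepness rho is taken as known. The function names (simulate, log_likelihood, fit_epsilon) and all parameter values are hypothetical. To keep the sketch dependency-free, a hand-derived gradient stands in for the automatic differentiation that PyTorch would provide.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n_agents, n_steps, epsilon, mu=0.1, rho=20.0, seed=0):
    """Generate (opinion distance, interaction outcome) records.

    Assumed generative form: a random pair (u, v) interacts with
    probability sigmoid(rho * (epsilon - |x_u - x_v|)); on a successful
    interaction both opinions move toward each other by a factor mu.
    """
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n_agents)]
    records = []
    for _ in range(n_steps):
        u, v = rng.sample(range(n_agents), 2)
        d = abs(x[u] - x[v])
        s = 1 if rng.random() < sigmoid(rho * (epsilon - d)) else 0
        records.append((d, s))
        if s:
            shift = mu * (x[v] - x[u])
            x[u] += shift
            x[v] -= shift
    return records


def log_likelihood(epsilon, records, rho=20.0):
    # Bernoulli log-likelihood of the observed interaction outcomes.
    ll = 0.0
    for d, s in records:
        p = sigmoid(rho * (epsilon - d))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += math.log(p) if s else math.log(1.0 - p)
    return ll


def fit_epsilon(records, rho=20.0, lr=0.01, n_iters=1000, init=0.5):
    """Maximize the mean log-likelihood over epsilon by gradient ascent.

    The derivative of the Bernoulli log-likelihood with respect to
    epsilon is rho * (s - p) per record; it is written out by hand here
    in place of automatic differentiation.
    """
    eps = init
    n = len(records)
    for _ in range(n_iters):
        grad = sum(rho * (s - sigmoid(rho * (eps - d))) for d, s in records) / n
        eps += lr * grad
    return eps


if __name__ == "__main__":
    data = simulate(n_agents=100, n_steps=1000, epsilon=0.3)
    print("estimated epsilon:", fit_epsilon(data))
```

Because the opinions are observed in this scenario, each record contributes an independent Bernoulli term and the log-likelihood is concave in epsilon, so plain gradient ascent with a small step size suffices. Scenarios (ii) and (iii) involve latent variables and would require marginalizing or estimating them, which this sketch does not attempt.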