Estimating the size of hidden populations using Multiple Systems Estimation (MSE) is a critical task in quantitative sociology; however, practical application is often hindered by imperfect administrative data and computational constraints. Real-world datasets frequently suffer from censoring and missingness due to privacy concerns, while standard inference methods, such as Maximum Likelihood Estimation (MLE) and Markov chain Monte Carlo (MCMC), can become computationally intractable or fail to converge when data are sparse. To address these limitations, we propose a novel simulation-based Bayesian inference framework built on Neural Bayes Estimators (NBE) and Neural Posterior Estimators (NPE). These neural methods are amortized: the cost of simulation and training is paid once, after which they return posterior estimates for new data almost instantly, making them well suited to secure research environments where computational resources are limited. Through extensive simulation studies, we show that neural estimators achieve accuracy comparable to MCMC while being orders of magnitude faster and not subject to the convergence failures that plague traditional samplers in sparse settings. We apply our method to two real-world case studies: estimating the prevalence of modern slavery in the UK and of female drug use in North East England.