The mean field variational Bayes (VB) algorithm implemented in Stan is relatively fast and efficient, making it feasible to produce model-estimated official statistics on a rapid timeline. Yet, while consistent point estimates of parameters are achieved for continuous data models, the mean field approximation often produces inaccurate uncertainty quantification to the extent that parameters are correlated a posteriori. In this paper, we propose a simulation procedure that calibrates uncertainty intervals for model parameters estimated under approximate algorithms so that they achieve nominal coverage. Our procedure detects and corrects biased estimation of both the first and second moments of approximate marginal posterior distributions induced by any estimation algorithm that produces consistent first moments when the model is correctly specified. The method generates replicate datasets using the parameters estimated in an initial model run. The model is then re-estimated on each replicate dataset, and we use the empirical distribution of the estimates across replicates to construct calibrated confidence intervals for the parameter estimates from the initial model run; these intervals are guaranteed to achieve nominal coverage asymptotically. We demonstrate the performance of our procedure in a Monte Carlo simulation study and apply it to real data from the Current Employment Statistics survey.
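The replicate-and-re-estimate idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the toy normal model, the function names `approx_fit` and `calibrated_interval`, the deliberately understated standard error, and the choice of a basic (reflected) bootstrap interval are all assumptions made for illustration. In practice, `approx_fit` would be replaced by a call to the approximate algorithm, such as a mean field VB fit.

```python
import numpy as np

ALPHA = 0.05  # two-sided 95% intervals throughout this sketch


def approx_fit(y):
    """Hypothetical stand-in for an approximate estimation algorithm.

    Returns a consistent point estimate of the mean, an estimate of the
    data standard deviation, and a standard error that is deliberately
    too small (mimicking understated uncertainty).
    """
    mu_hat = y.mean()
    sigma_hat = y.std(ddof=1)
    naive_se = 0.5 * sigma_hat / np.sqrt(len(y))  # too narrow by design
    return mu_hat, sigma_hat, naive_se


def calibrated_interval(y, n_rep=500, seed=0):
    """Return (naive, calibrated) 95% intervals for the mean."""
    rng = np.random.default_rng(seed)
    mu_hat, sigma_hat, naive_se = approx_fit(y)

    # Generate replicate datasets from the initial estimates and
    # re-estimate the model on each one.
    reps = np.empty(n_rep)
    for r in range(n_rep):
        y_rep = rng.normal(mu_hat, sigma_hat, size=len(y))
        reps[r] = approx_fit(y_rep)[0]

    # Empirical distribution of estimation error across replicates.
    # A nonzero mean here would signal bias in the first moment.
    err_lo, err_hi = np.quantile(reps - mu_hat, [ALPHA / 2, 1 - ALPHA / 2])

    # Basic bootstrap interval: reflect the errors around the initial estimate.
    calibrated = (mu_hat - err_hi, mu_hat - err_lo)
    # Naive interval from the algorithm's own standard error (1.96 = 95% normal quantile).
    naive = (mu_hat - 1.96 * naive_se, mu_hat + 1.96 * naive_se)
    return naive, calibrated


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(2.0, 3.0, size=200)
    naive, calibrated = calibrated_interval(data)
    print("naive interval:     ", naive)
    print("calibrated interval:", calibrated)
```

In this toy setting the calibrated interval comes out roughly twice as wide as the naive one, because the spread of the re-estimated means reflects the true sampling variability rather than the understated standard error reported by the stand-in algorithm.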