Parameter inference, i.e., inferring the posterior distribution of the parameters of a statistical model given some data, is a central problem in many scientific disciplines. Generative models can be used as an alternative to Markov Chain Monte Carlo methods for conducting posterior inference, in both likelihood-based and simulation-based problems. However, assessing the accuracy of posteriors encoded in generative models is not straightforward. In this paper, we introduce "Tests of Accuracy with Random Points" (TARP) coverage testing as a method to estimate coverage probabilities of generative posterior estimators. Our method differs from existing coverage-based methods, which require posterior evaluations. We prove that our approach is necessary and sufficient to show that a posterior estimator is accurate. We demonstrate the method on a variety of synthetic examples and show that TARP can be used to test the results of posterior inference analyses in high-dimensional spaces. We also show that our method can detect inaccurate inferences in cases where existing methods fail.
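To make the idea of a sample-based coverage test with random points concrete, the following is a minimal sketch of one way such a check could look. It is not the authors' implementation. The function name `random_point_coverage`, the choice of Euclidean distance, and drawing reference points uniformly over the bounding box of the posterior samples are all assumptions made for illustration. For each simulation, it measures the fraction of posterior samples that lie closer to a random reference point than the true parameter does, and then compares the empirical distribution of those fractions with the nominal credibility levels.

```python
import numpy as np

def random_point_coverage(samples, truths, rng, n_levels=21):
    """Sketch of a random-point coverage check (hypothetical helper).

    samples: (n_sims, n_samples, n_dims) draws from the posterior estimator
    truths:  (n_sims, n_dims) parameters that generated each simulated dataset
    """
    n_sims, _, n_dims = samples.shape
    # Assumption: reference points are uniform over the bounding box of all samples.
    low, high = samples.min(axis=(0, 1)), samples.max(axis=(0, 1))
    refs = rng.uniform(low, high, size=(n_sims, n_dims))
    # Assumption: Euclidean distance to the reference point.
    d_samples = np.linalg.norm(samples - refs[:, None, :], axis=-1)
    d_truth = np.linalg.norm(truths - refs, axis=-1)
    # Fraction of samples closer to the reference point than the truth is.
    frac = (d_samples < d_truth[:, None]).mean(axis=1)
    levels = np.linspace(0.0, 1.0, n_levels)
    coverage = np.array([(frac < level).mean() for level in levels])
    return levels, coverage

# Toy conjugate model: theta ~ N(0, I), x = theta + N(0, I) noise,
# so the exact posterior is N(x / 2, I / 2).
rng = np.random.default_rng(0)
n_sims, n_samples, n_dims = 2000, 200, 3
theta = rng.normal(size=(n_sims, n_dims))
x = theta + rng.normal(size=(n_sims, n_dims))
post_mean, post_std = x / 2.0, np.sqrt(0.5)

noise_shape = (n_sims, n_samples, n_dims)
exact = post_mean[:, None, :] + post_std * rng.normal(size=noise_shape)
# Deliberately overconfident estimator: same mean, half the standard deviation.
narrow = post_mean[:, None, :] + 0.5 * post_std * rng.normal(size=noise_shape)

levels, cov_exact = random_point_coverage(exact, theta, rng)
_, cov_narrow = random_point_coverage(narrow, theta, rng)
print("max deviation, exact posterior: ", np.abs(cov_exact - levels).max())
print("max deviation, narrow posterior:", np.abs(cov_narrow - levels).max())
```

In this sketch, an accurate estimator should give a coverage curve close to the diagonal (coverage roughly equal to the credibility level), while the overconfident estimator should depart from it. Only posterior samples are used; no posterior density evaluations are needed.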