There is no other model or hypothesis verification tool in Bayesian statistics that is as widely used as the Bayes factor. We focus on generative models that are likelihood-free and, therefore, render the computation of Bayes factors (marginal likelihood ratios) far from obvious. We propose a deep learning estimator of the Bayes factor based on simulated data from two competing models using the likelihood ratio trick. This estimator is devoid of summary statistics and obviates some of the difficulties with ABC model choice. We establish sufficient conditions for consistency of our Deep Bayes Factor estimator as well as its consistency as a model selection tool. We investigate the performance of our estimator on various examples using a wide range of quality metrics related to estimation and model decision accuracy. After training, our deep learning approach enables rapid evaluations of the Bayes factor estimator on any fictional data arriving from either hypothesized model, not just the observed data $Y_0$. This allows us to inspect entire Bayes factor distributions under the two models and to quantify the relative location of the Bayes factor evaluated at $Y_0$ in light of these distributions. Such tail area evaluations are not possible for Bayes factor estimators tailored to $Y_0$. We find the performance of our Deep Bayes Factors competitive with existing MCMC techniques that require knowledge of the likelihood function. We also consider variants for posterior or intrinsic Bayes factor estimation. We demonstrate the usefulness of our approach on a relatively high-dimensional real data example about determining cognitive biases.
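The likelihood ratio trick mentioned above can be illustrated with a minimal sketch. If a classifier $D(y)$ is trained to distinguish balanced samples simulated from the two models, then $D(y)/(1-D(y))$ estimates the ratio of marginal likelihoods, so the classifier's logit estimates the log Bayes factor. The toy models, the function names (`simulate`, `fit_logistic`, `log_bf`), and the use of plain logistic regression in place of a deep network are all assumptions made for illustration; they are not taken from the paper. The toy models are chosen so that the true log Bayes factor is known in closed form, $\log \mathrm{BF}_{12}(y) = 1 - y$, which a linear logit can represent exactly.

```python
import math
import random


def simulate(prior_mean, n, rng):
    """Toy simulator (assumed): theta ~ N(prior_mean, 1), then y | theta ~ N(theta, 1).

    Only forward simulation is used; the likelihood is never evaluated.
    The implied marginal of y is N(prior_mean, 2).
    """
    return [rng.gauss(rng.gauss(prior_mean, 1.0), 1.0) for _ in range(n)]


def fit_logistic(ys, labels, lr=0.5, steps=300):
    """Fit P(label = 1 | y) = sigmoid(b0 + b1 * y) by full-batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(ys)
    for _ in range(steps):
        g0 = g1 = 0.0
        for y, t in zip(ys, labels):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * y)))
            g0 += p - t
            g1 += (p - t) * y
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


rng = random.Random(0)
n_per_model = 2000

# Balanced training set: label 1 for draws from model 1, label 0 for model 2.
y1 = simulate(0.0, n_per_model, rng)  # model 1: prior centered at 0
y2 = simulate(2.0, n_per_model, rng)  # model 2: prior centered at 2
ys = y1 + y2
labels = [1.0] * n_per_model + [0.0] * n_per_model

b0, b1 = fit_logistic(ys, labels)


def log_bf(y):
    """Estimated log Bayes factor of model 1 versus model 2 at data y.

    With equal class sizes, logit D(y) estimates log m1(y) - log m2(y).
    """
    return b0 + b1 * y


# Once trained, the estimator can be evaluated at any data point,
# not only at a single observed value.
for y in (0.0, 1.0, 2.0):
    print(f"y = {y:.1f}  estimated log BF = {log_bf(y):+.3f}  exact = {1.0 - y:+.3f}")
```

Because the fitted classifier is a function of the data, evaluating it on fresh simulations from either model gives the distribution of the estimated Bayes factor under that model, which is the kind of tail area comparison the abstract describes.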