Bayesian inference is optimal when the statistical model is well-specified, but outside this setting it can fail catastrophically; accordingly, a wealth of post-Bayesian methodologies has been proposed. Predictively oriented (PrO) approaches lift the statistical model $P_θ$ to an (infinite) mixture model $\int P_θ\; \mathrm{d}Q(θ)$ and fit this predictive distribution by minimising an entropy-regularised objective functional. In the well-specified setting one expects the mixing distribution $Q$ to concentrate around the true data-generating parameter in the large data limit, whereas such singular concentration will typically not be observed if the model is misspecified. Our contribution is to demonstrate that one can empirically detect model misspecification by comparing the standard Bayesian posterior to the PrO `posterior' $Q$, providing a novel and widely applicable diagnostic tool for the standard Bayesian workflow. To operationalise this, we present an efficient numerical algorithm based on variational gradient descent. A simulation study, and a more detailed case study involving a Bayesian inverse problem in seismology, confirm that model misspecification can be automatically detected using this framework.
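As a schematic illustration of the PrO construction (the symbols $P_Q$, $\mathcal{L}_n$, $\lambda$ and $Q_0$ are notational assumptions introduced here, and the relative-entropy penalty is only one possible reading of "entropy-regularised"), the lifted predictive distribution and a generic objective of this type could be written as

$$
P_Q := \int P_θ \; \mathrm{d}Q(θ), \qquad Q_n \in \arg\min_{Q} \Big\{ \mathcal{L}_n(P_Q) + \lambda \, \mathrm{KL}(Q \,\|\, Q_0) \Big\},
$$

where $\mathcal{L}_n$ stands for some data-dependent predictive loss based on $n$ observations, $Q_0$ for a reference distribution on the parameter, and $\lambda > 0$ for a regularisation weight. In this notation, the diagnostic idea is to compare the spread of $Q_n$ with that of the standard Bayesian posterior: a $Q_n$ that does not concentrate as $n$ grows would suggest misspecification.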