The posterior predictive $p$-value (ppp) is widely used in Bayesian model evaluation. However, due to the double use of the data, the ppp may not be a valid $p$-value even in large samples: the asymptotic null distribution of the ppp can be non-uniform unless the underlying test statistic satisfies certain well-calibration conditions. Such conditions have been studied in the literature for asymptotically normal test statistics. We extend this line of work by establishing well-calibration conditions for test statistics that are not necessarily asymptotically normal. In particular, we show that Kolmogorov-Smirnov (KS)-type test statistics satisfy these conditions, so that their ppps are asymptotically well-calibrated Bayesian $p$-values. KS-type statistics are versatile, omnibus, and sensitive to model misspecification. They apply to i.i.d. real-valued data, as well as to non-identically distributed observations under regression models. Numerical experiments demonstrate that such $p$-values are well behaved in finite samples and can effectively detect a wide range of alternative models.
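To make the quantity concrete, the following is a minimal Monte Carlo sketch of a ppp based on a KS discrepancy, for an illustrative normal-mean model (known unit variance, flat prior, so the posterior of the mean is $N(\bar{y}, 1/n)$); the function name `ppp_ks` and the specific model are assumptions for illustration, not the paper's setup.

```python
import numpy as np
from scipy.stats import kstest, norm

rng = np.random.default_rng(0)

def ppp_ks(y, n_draws=2000):
    """Monte Carlo posterior predictive p-value with a KS discrepancy.

    Illustrative model (an assumption for this sketch): y_i ~ N(mu, 1)
    with a flat prior on mu, so the posterior of mu is N(ybar, 1/n).
    The ppp is the posterior predictive probability that the KS
    statistic of replicated data exceeds that of the observed data.
    """
    n = len(y)
    ybar = y.mean()
    exceed = 0
    for _ in range(n_draws):
        mu = rng.normal(ybar, 1.0 / np.sqrt(n))          # posterior draw
        t_obs = kstest(y, norm(loc=mu).cdf).statistic     # observed discrepancy
        y_rep = rng.normal(mu, 1.0, size=n)               # replicated data
        t_rep = kstest(y_rep, norm(loc=mu).cdf).statistic
        exceed += (t_rep >= t_obs)
    return exceed / n_draws
```

The double use of the data is visible above: the observed `y` enters both through the posterior draw of `mu` and through the discrepancy `t_obs` evaluated at that same draw; the paper's well-calibration conditions concern when this construction still yields an asymptotically uniform null distribution.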