Bayes factors for composite hypotheses have difficulty encoding vague prior knowledge, leading to conflicts between objectivity and sensitivity, including the Jeffreys-Lindley paradox. To address these issues we revisit the posterior Bayes factor, in which the posterior distribution from the data at hand is re-used in the Bayes factor for the same data. We argue that this is biased when calibrated against proper Bayes factors, but propose bias adjustments to allow interpretation on the same scale. In the important case of a regular normal model, the bias on the log scale is half the number of parameters. The resulting empirical Bayes factor is closely related to the widely applicable information criterion. We develop test-based empirical Bayes factors for several standard tests and propose an extension to multiple testing closely related to the optimal discovery procedure. When only a P-value is available, such as in non-parametric tests, we obtain a Bayes factor calibration of 10p. We propose interpreting the strength of Bayes factors on a logarithmic scale with base 3.73, reflecting the sharpest distinction between weaker and stronger belief. Empirical Bayes factors are a frequentist-Bayesian compromise expressing an evidential view of hypothesis testing.
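As a rough illustration of the last two quantitative statements, the following minimal Python sketch applies the 10p calibration to a P-value and expresses the resulting Bayes factor in units of the base-3.73 logarithm. The function names `bayes_factor_from_p` and `strength_units` are hypothetical. The sketch also assumes that 10p is read as the Bayes factor for the null relative to the alternative, a direction the abstract does not state. It does not reproduce the derivation of either quantity.

```python
import math

# Base of the logarithmic strength scale quoted in the abstract.
LOG_BASE = 3.73


def bayes_factor_from_p(p: float) -> float:
    """Apply the 10p calibration quoted in the abstract.

    Assumption: the result is treated as the Bayes factor for the null
    relative to the alternative; the abstract does not give the direction.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return 10.0 * p


def strength_units(bayes_factor: float, base: float = LOG_BASE) -> float:
    """Express a Bayes factor as a logarithm in the given base."""
    if bayes_factor <= 0.0:
        raise ValueError("bayes_factor must be positive")
    return math.log(bayes_factor) / math.log(base)


if __name__ == "__main__":
    for p in (0.05, 0.005, 0.0005):
        bf = bayes_factor_from_p(p)
        print(f"p={p}: calibrated BF={bf:.4g}, strength={strength_units(bf):.2f} units")
```

Under the stated assumption, negative values of `strength_units` correspond to evidence against the null, and each unit corresponds to a factor of 3.73 in the Bayes factor.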