Bayes factor null hypothesis tests provide a viable alternative to frequentist measures of evidence quantification. For realistic models of interest, Bayes factors cannot be calculated exactly; they have to be estimated, which involves approximating complex integrals. Crucially, the accuracy of these estimates, i.e., whether an estimated Bayes factor corresponds to the true Bayes factor, is unknown and may depend on the data, prior, and likelihood. We recently developed a statistical procedure, simulation-based calibration (SBC) for Bayes factors, that tests whether the Bayes factors computed for a given analysis are accurate. Here, we use SBC for Bayes factors to test whether Bayes factors are estimated accurately in some common cognitive designs. We use the bridgesampling/brms packages as well as the BayesFactor package in R. We find that Bayes factor estimates are accurate and exhibit little bias in Latin square designs with (a) random effects for subjects only and (b) crossed random effects for subjects and items but a single fixed factor. However, Bayes factor estimates turn out to be biased and liberal in a 2×2 design with crossed random effects for subjects and items. These results suggest that researchers should test whether Bayes factor estimates are accurate for their individual analysis. Moreover, future research is needed to determine the boundary conditions under which Bayes factor estimates are accurate or biased, and software development is needed to improve estimation accuracy.
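The logic of SBC for Bayes factors can be illustrated with a minimal Python sketch. It is not the authors' implementation, which relies on the R packages named above. It assumes a toy problem chosen only because its Bayes factor has a closed form: a normal mean with known standard deviation, with H0: mu = 0 and H1: mu ~ Normal(0, tau^2). The function names `log_bf10`, `sigmoid`, and `sbc_bayes_factor`, all default values, and the `log_bias` parameter (an artificial distortion standing in for estimation error) are hypothetical. The check rests on one property: if the true hypothesis is drawn from the prior model probabilities and the data are simulated from it, then the posterior probability of H1, averaged over simulations, should match the prior probability of H1 when Bayes factors are accurate.

```python
import math
import random


def log_bf10(ybar, n, sigma=1.0, tau=1.0):
    """Exact log Bayes factor (H1 over H0) for a normal mean with known sigma.

    H0: mu = 0;  H1: mu ~ Normal(0, tau^2).  The sample mean is sufficient,
    so BF10 is the ratio of its marginal densities under H1 and H0.
    """
    v0 = sigma**2 / n            # variance of the sample mean under H0
    v1 = tau**2 + sigma**2 / n   # marginal variance of the sample mean under H1
    return 0.5 * math.log(v0 / v1) + 0.5 * ybar**2 * (1.0 / v0 - 1.0 / v1)


def sigmoid(x):
    """Numerically stable logistic function (log odds -> probability)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sbc_bayes_factor(n_sims=5000, n=20, prior_h1=0.5, log_bias=0.0,
                     sigma=1.0, tau=1.0, seed=1):
    """Return the average posterior probability of H1 across simulations.

    log_bias is added to every log Bayes factor to mimic a biased estimator
    (assumption for illustration); log_bias = 0 uses the exact Bayes factor.
    """
    rng = random.Random(seed)
    prior_log_odds = math.log(prior_h1 / (1.0 - prior_h1))
    total = 0.0
    for _ in range(n_sims):
        # 1. Draw the true hypothesis from the prior model probabilities.
        h1_true = rng.random() < prior_h1
        # 2. Draw the parameter from that hypothesis's prior.
        mu = rng.gauss(0.0, tau) if h1_true else 0.0
        # 3. Simulate a data set and reduce it to the sample mean.
        y = [rng.gauss(mu, sigma) for _ in range(n)]
        ybar = sum(y) / n
        # 4. Convert the (possibly distorted) Bayes factor to a posterior
        #    probability of H1 and accumulate it.
        total += sigmoid(prior_log_odds + log_bf10(ybar, n, sigma, tau) + log_bias)
    return total / n_sims


if __name__ == "__main__":
    print("exact Bayes factors:    ", sbc_bayes_factor())
    print("inflated Bayes factors: ", sbc_bayes_factor(log_bias=math.log(3.0)))
```

With exact Bayes factors, the average should fall close to the prior probability of 0.5, up to simulation noise. Inflating every Bayes factor by a factor of 3 pushes the average above 0.5, which is the kind of "liberal" miscalibration the procedure is meant to detect. In a real analysis, step 4 would use the estimated Bayes factor from the software under test in place of the closed-form value.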