A pervasive methodological error is the post-hoc interpretation of $p$-values. A $p$-value $p$ is the smallest significance level at which we would have rejected the null had we chosen level $p$ in advance; it is not the smallest significance level at which we actually reject the null. We introduce post-hoc $p$-values, which do admit such a post-hoc interpretation. We show that $p$ is a post-hoc $p$-value if and only if $1/p$ is an $e$-value, a recently introduced statistical object. The product of independent post-hoc $p$-values is again a post-hoc $p$-value, making them easy to combine. Moreover, any post-hoc $p$-value can be trivially improved if we permit external randomization, but only (essentially) non-randomized post-hoc $p$-values can be arbitrarily merged through multiplication. In addition, we discuss what constitutes a `good' post-hoc $p$-value. Finally, we argue that post-hoc $p$-values eliminate the need for a pre-specified significance level, such as $\alpha = .05$ or $\alpha = .005$ (Benjamin et al., 2018). We believe this may take away incentives for $p$-hacking and contribute to solving the file-drawer problem, as both of these issues arise from using a pre-specified significance level.
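The equivalence with $e$-values and the product rule stated above can be checked exactly on a toy example. The sketch below is illustrative only and is not taken from the paper: it assumes the standard definition of an $e$-value (a nonnegative statistic whose expectation under the null is at most 1), and the coin-flip hypotheses and all function names are hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy setup (an assumption, not from the paper): one coin flip X,
# null H0: P(X=1) = 1/2, alternative: P(X=1) = 3/4.
NULL = {0: Fraction(1, 2), 1: Fraction(1, 2)}
ALT = {0: Fraction(1, 4), 1: Fraction(3, 4)}


def e_value(x):
    """Likelihood ratio of alternative to null; its expectation under H0 is 1."""
    return ALT[x] / NULL[x]


def post_hoc_p(x):
    """Reciprocal of the e-value, following the equivalence in the abstract."""
    return 1 / e_value(x)


def single(xs):
    """Post-hoc p-value based on the first (and only) flip."""
    return post_hoc_p(xs[0])


def multiplied(xs):
    """Product of the post-hoc p-values of several independent flips."""
    result = Fraction(1)
    for x in xs:
        result *= post_hoc_p(x)
    return result


def null_expectation_of_reciprocal(p_fn, n_flips):
    """Exact E[1/p] under H0, enumerating all outcomes of n independent flips."""
    total = Fraction(0)
    for xs in product(NULL, repeat=n_flips):
        prob = Fraction(1)
        for x in xs:
            prob *= NULL[x]
        total += prob / p_fn(xs)
    return total


print(null_expectation_of_reciprocal(single, 1))
print(null_expectation_of_reciprocal(multiplied, 3))
```

In this example both expectations equal 1, so $1/p$ meets the $e$-value condition for a single flip and for the product over three independent flips. Note that `post_hoc_p(0)` equals 2: under these assumptions a post-hoc $p$-value is not confined to $[0, 1]$.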