We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes. Tests based on e-values are safe, i.e., they preserve Type-I error guarantees, under such optional continuation. We define growth-rate optimality (GRO) as an analogue of power in an optional continuation context, and we show how to construct GRO e-variables for general testing problems with composite null and alternative hypotheses, emphasizing models with nuisance parameters. GRO e-values take the form of Bayes factors with special priors. We illustrate the theory using several classic examples, including a one-sample safe t-test and the 2 × 2 contingency table. Because they share Fisherian, Neymanian and Jeffreys-Bayesian interpretations, e-values may provide a methodology acceptable to adherents of all three schools.
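To make optional continuation concrete, the following is a minimal sketch, not the GRO construction for composite hypotheses described above. It assumes the simplest setting: a point null (Gaussian mean 0, known standard deviation) against a point alternative, where the likelihood ratio is an e-value. E-values from successive studies are combined by multiplication, and the null is rejected once the running product reaches 1/alpha, which by Markov's inequality keeps the Type-I error at or below alpha. The function names and the default `mu_alt` and `sigma` values are illustrative assumptions.

```python
import math


def likelihood_ratio_e_value(xs, mu_alt=0.5, sigma=1.0):
    """E-value for H0: N(0, sigma^2) against the point alternative N(mu_alt, sigma^2).

    Illustrative assumption: both hypotheses are simple, so the likelihood
    ratio has expectation 1 under H0 and is therefore an e-value.
    """
    # Per-observation log likelihood ratio: (mu_alt * x - mu_alt^2 / 2) / sigma^2
    log_e = sum((mu_alt * x - 0.5 * mu_alt ** 2) / sigma ** 2 for x in xs)
    return math.exp(log_e)


def combine(e_values):
    """Combine e-values from successive studies by multiplication."""
    return math.prod(e_values)


def reject(e_total, alpha=0.05):
    """Reject H0 when the combined e-value is at least 1 / alpha."""
    return e_total >= 1.0 / alpha


if __name__ == "__main__":
    # Hypothetical data from two studies; the second was run only
    # because the first looked promising.
    study_1 = [0.9, 1.4, 0.3, 1.1]
    study_2 = [1.2, 0.8, 1.6, 0.7, 1.0]
    e_total = combine([likelihood_ratio_e_value(s) for s in (study_1, study_2)])
    print(f"combined e-value: {e_total:.2f}, reject H0: {reject(e_total)}")
```

The decision to run the second study may depend on the first study's outcome without invalidating the guarantee, provided each new e-value is a valid e-value conditional on the earlier data. This is the sense in which the test stays safe under optional continuation.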