Simultaneous testing of one hypothesis at multiple alpha levels can be performed within a conventional Neyman-Pearson framework. This is achieved by treating the hypothesis as a family of hypotheses, each member of which explicitly concerns test level as well as effect size. Such testing encourages researchers to think about error rates and strength of evidence in both the statistical design and reporting stages of a study. Here, we show that these multi-alpha level tests can deliver acceptable expected total error costs. We first present formulas for expected error costs from single alpha and multiple alpha level tests, given prior probabilities of effect sizes that have either dichotomous or continuous distributions. Error costs are tied to decisions, with different decisions assumed for each of the potential outcomes in the multi-alpha level case. Expected total costs for tests at single and multiple alpha levels are then compared with optimal costs. This comparison highlights how sensitive optimization is to estimated error costs and to assumptions about prevalence. Testing at multiple default thresholds removes the need to formally identify decisions, or to model costs and prevalence as required in optimization approaches. Although total expected error costs with this approach will not be optimal, our results suggest they may be lower, on average, than when so-called optimal test levels are based on mis-specified models.
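As a rough illustration of the kind of expected-cost calculation described above for a dichotomous effect-size prior, the following is a minimal sketch and not the paper's own formulas, which are not reproduced in this abstract. It assumes a one-sided z-test with known unit variance, a standardized effect size, and a prevalence equal to the prior probability that the effect is real. All function names, the cost parameters, and the per-outcome cost vectors are hypothetical choices made for illustration. In the multi-level case, each outcome band (significant at the smallest level, significant only at a larger level, not significant at any level) is given its own cost under each state of the world, mirroring the idea that different decisions follow from different outcomes.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for critical values and power


def power(alpha, effect_size, n):
    """Power of a one-sided z-test at level alpha (assumed test; unit variance)."""
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - _STD_NORMAL.cdf(z_crit - effect_size * n ** 0.5)


def expected_cost_single(alpha, effect_size, n, prevalence, cost_type1, cost_type2):
    """Expected total error cost for a test at one alpha level.

    prevalence is the assumed prior probability that the effect is real.
    """
    type1 = (1.0 - prevalence) * alpha * cost_type1
    type2 = prevalence * (1.0 - power(alpha, effect_size, n)) * cost_type2
    return type1 + type2


def expected_cost_multi(alphas, effect_size, n, prevalence, costs_h0, costs_h1):
    """Expected total error cost when one test is read at several alpha levels.

    alphas must be in ascending order. costs_h0 and costs_h1 (hypothetical
    inputs) each hold len(alphas) + 1 entries, one per outcome band:
    band 0 is "significant at the smallest alpha", and the last band is
    "not significant at any level".
    """
    alphas = list(alphas)
    if alphas != sorted(alphas):
        raise ValueError("alphas must be in ascending order")
    if len(costs_h0) != len(alphas) + 1 or len(costs_h1) != len(alphas) + 1:
        raise ValueError("need one cost per outcome band")

    # Cumulative rejection probabilities under the null and the alternative.
    cum_h0 = [0.0] + alphas + [1.0]
    cum_h1 = [0.0] + [power(a, effect_size, n) for a in alphas] + [1.0]

    total = 0.0
    for band in range(len(alphas) + 1):
        p_band_h0 = cum_h0[band + 1] - cum_h0[band]
        p_band_h1 = cum_h1[band + 1] - cum_h1[band]
        total += (1.0 - prevalence) * p_band_h0 * costs_h0[band]
        total += prevalence * p_band_h1 * costs_h1[band]
    return total


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the paper.
    print(expected_cost_single(0.05, 0.5, 30, 0.3, 1.0, 4.0))
    print(expected_cost_multi([0.005, 0.05], 0.5, 30, 0.3,
                              [1.0, 0.5, 0.0], [0.0, 2.0, 4.0]))
```

With a single level and costs placed only on the rejecting band under the null and the non-rejecting band under the alternative, the multi-level function reduces to the single-level one, which gives a simple consistency check on the sketch.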