Statistical power is a measure of the replicability of a categorical hypothesis test. Formally, it is the probability of detecting an effect when a true effect is present in the population. Optimizing statistical power as a function of the parameters of a hypothesis test is therefore desirable. For most hypothesis tests, however, the explicit functional form of statistical power in terms of individual model parameters is unknown, although power can be calculated for any given set of parameter values using simulated experiments. These simulated experiments are usually computationally expensive, so constructing the entire statistical power manifold by simulation can be very time-consuming. We propose a novel genetic algorithm-based framework for learning statistical power manifolds. For a multiple linear regression $F$-test, we show that the proposed framework learns the statistical power manifold much faster than a brute-force approach because it significantly reduces the number of queries to the power oracle. We also show that the quality of the learned manifold improves as the number of genetic algorithm iterations increases. Such tools are useful for evaluating statistical power trade-offs when researchers have little information about a priori best guesses of the primary effect sizes of interest, or about how sampling variability in non-primary effects impacts power for the primary ones.
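To make the notion of a simulation-based power oracle concrete, the following is a minimal sketch of how power at a single parameter setting might be estimated by Monte Carlo for the overall $F$-test in multiple linear regression. It is an illustration under stated assumptions, not the implementation used in this work: the names `f_statistic`, `simulate_f`, and `power_oracle` are hypothetical, predictors are assumed to be independent standard normal, errors are assumed Gaussian, and the critical value is estimated by simulating under the null (to keep the sketch NumPy-only) rather than taken from the $F$ distribution's quantile function.

```python
import numpy as np


def f_statistic(X, y):
    """Overall F statistic for H0: all slope coefficients are zero."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    rss = float(resid @ resid)                 # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    return ((tss - rss) / p) / (rss / (n - p - 1))


def simulate_f(beta, n, sigma, n_sims, rng):
    """Simulate n_sims datasets from y = X @ beta + noise; return F statistics."""
    beta = np.asarray(beta, dtype=float)
    stats = np.empty(n_sims)
    for i in range(n_sims):
        # Assumption: independent standard-normal predictors, Gaussian errors.
        X = rng.standard_normal((n, beta.size))
        y = X @ beta + sigma * rng.standard_normal(n)
        stats[i] = f_statistic(X, y)
    return stats


def power_oracle(beta, n, sigma=1.0, alpha=0.05, n_sims=2000, seed=0):
    """Monte Carlo estimate of power at one point (beta, n, sigma)."""
    rng = np.random.default_rng(seed)
    # Assumption: critical value estimated from simulated null F statistics.
    null_stats = simulate_f(np.zeros(len(beta)), n, sigma, n_sims, rng)
    crit = np.quantile(null_stats, 1.0 - alpha)
    alt_stats = simulate_f(beta, n, sigma, n_sims, rng)
    # Power estimate: fraction of simulated experiments that reject H0.
    return float(np.mean(alt_stats > crit))


if __name__ == "__main__":
    for b in (0.0, 0.2, 0.5):
        print(b, power_oracle([b, 0.0], n=50))
```

Each call to `power_oracle` runs thousands of regressions to evaluate a single point, which is why filling a dense grid over several parameters by brute force becomes expensive and why reducing the number of oracle queries matters.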