Rare and Weak models for multiple hypothesis testing assume that only a small proportion of the tested hypotheses concern non-null effects and that the individual effects are only moderately large, so that they generally do not stand out individually, for example in a Bonferroni analysis. Such models have been studied in several settings: some studies focus on an underlying Gaussian means model for the hypotheses being tested, others on Poisson or Binomial models. These seemingly different models share a common asymptotic structure. When the evidence of each individual test is summarized by the negative logarithm of its P-value, the model is asymptotically equivalent to one in which most negative log P-values follow a standard exponential distribution, while a small fraction may follow an alternative distribution that is approximately noncentral chi-squared with one degree of freedom. This log-chi-squared approximation differs from Bahadur's log-normal approximation, which is unsuitable for analyzing Rare and Weak multiple testing models. We characterize the asymptotic performance of global tests that combine asymptotically log-chi-squared P-values in terms of the chi-squared mixture parameters: the scaling parameter controlling heteroscedasticity, the non-centrality parameter describing the effect size whenever an effect exists, and the parameter controlling the rarity of the non-null effects. In a phase space involving the last two parameters, we derive a region where all tests are asymptotically powerless. Outside this region, the Berk-Jones and Higher Criticism tests have maximal power. Inference techniques based on the minimal P-value, false-discovery-rate control, and Fisher's combination test have sub-optimal asymptotic phase diagrams.
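As a rough illustration of the mixture structure described in the abstract, the following Python sketch simulates P-values from a rare and weak model and computes a Higher Criticism statistic. Several details are assumptions made for illustration and are not stated in the abstract: the calibration of the rarity as `n ** (-beta)` and of the non-centrality as `sqrt(2 * r * log(n))`, the reading of the alternative as `-2 log p = (mu + sigma * Z)^2` with `Z` standard normal, and the function names `simulate_pvalues` and `higher_criticism`. The Higher Criticism form used is the common one that maximizes standardized deviations of the sorted P-values over the smallest fraction `gamma` of them.

```python
import numpy as np


def simulate_pvalues(n, beta, r, sigma=1.0, seed=None):
    """Draw n P-values from an assumed rare/weak log-chi-squared mixture.

    Null P-values are uniform, so -log(p) is standard exponential.
    Non-null P-values satisfy -2*log(p) = (mu + sigma*Z)**2 (an assumed
    calibration), which is a scaled noncentral chi-squared on one df.
    """
    rng = np.random.default_rng(seed)
    eps = n ** (-beta)                    # assumed rarity calibration
    mu = np.sqrt(2.0 * r * np.log(n))     # assumed effect-size calibration
    p = rng.uniform(size=n)               # null component
    nonnull = rng.uniform(size=n) < eps   # which tests carry an effect
    z = rng.standard_normal(int(nonnull.sum()))
    p[nonnull] = np.exp(-0.5 * (mu + sigma * z) ** 2)
    return p


def higher_criticism(p, gamma=0.5):
    """Higher Criticism statistic over the smallest gamma-fraction of P-values."""
    # Clip to avoid division by zero at P-values of exactly 0 or 1.
    p = np.sort(np.clip(np.asarray(p, dtype=float), 1e-300, 1.0 - 1e-12))
    n = p.size
    i = np.arange(1, n + 1)
    z = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    k = max(1, int(gamma * n))
    return z[:k].max()


if __name__ == "__main__":
    p_null = simulate_pvalues(100_000, beta=0.6, r=0.0, sigma=0.0, seed=0)
    p_alt = simulate_pvalues(100_000, beta=0.6, r=0.9, seed=0)
    print("HC (no effect):   ", higher_criticism(p_null))
    print("HC (rare and weak):", higher_criticism(p_alt))
```

Under these assumptions, varying `beta` and `r` traces out the two-parameter phase space mentioned in the abstract; the sketch does not reproduce the paper's phase boundaries, and it omits the Berk-Jones test.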