The local false discovery rate (lfdr) of Efron et al. (2001) enjoys major conceptual and decision-theoretic advantages over the false discovery rate (FDR) as an error criterion in multiple testing, but is only well-defined in Bayesian models where the truth status of each null hypothesis is random. We define a frequentist counterpart to the lfdr based on the relative frequency of nulls at each point in the sample space. The frequentist lfdr is defined without reference to any prior, but preserves several important properties of the Bayesian lfdr: For continuous test statistics, $\text{lfdr}(t)$ gives the probability, conditional on observing some statistic equal to $t$, that the corresponding null hypothesis is true. Evaluating the lfdr at an individual test statistic also yields a calibrated forecast of whether its null hypothesis is true. Finally, thresholding the lfdr at $\frac{1}{1+\lambda}$ gives the best separable rejection rule under the weighted classification loss where Type I errors are $\lambda$ times as costly as Type II errors. The lfdr can be estimated efficiently using parametric or non-parametric methods, and a closely related error criterion can be provably controlled in finite samples under independence assumptions. Whereas the FDR measures the average quality of all discoveries in a given rejection region, our lfdr measures how the quality of discoveries varies across the rejection region, allowing for a more fine-grained analysis without requiring the introduction of a prior.
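The thresholding rule described above can be sketched numerically. The following is a minimal illustration, not the paper's method: it assumes a standard two-groups setup where lfdr(t) = π₀f₀(t)/f(t), takes π₀ as known, estimates the marginal density f with a kernel density estimate, and rejects where the estimated lfdr falls below 1/(1+λ). All variable names and the simulated mixture are hypothetical.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical simulation: 80% true nulls with z ~ N(0,1),
# 20% alternatives with z ~ N(3,1).
pi0 = 0.8
n = 20_000
is_null = rng.random(n) < pi0
z = np.where(is_null, rng.normal(0.0, 1.0, n), rng.normal(3.0, 1.0, n))

# Two-groups estimate: lfdr(t) = pi0 * f0(t) / f(t), with f0 the theoretical
# N(0,1) null density and f estimated by a kernel density estimate.
# (pi0 is assumed known here purely for illustration.)
f_hat = gaussian_kde(z)
lfdr = pi0 * norm.pdf(z) / f_hat(z)

# Threshold at 1/(1+lambda): optimal separable rule when a Type I error is
# lambda times as costly as a Type II error (lambda = 4 gives cutoff 0.2).
lam = 4
reject = lfdr < 1.0 / (1.0 + lam)

print(f"rejections: {reject.sum()}, "
      f"null fraction among rejections: {is_null[reject].mean():.3f}")
```

In this sketch the realized fraction of nulls among the rejections tracks the average estimated lfdr over the rejection region, which is the sense in which the lfdr gives a more fine-grained, pointwise view of discovery quality than a single region-wide FDR.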