There has been recent interest in estimating the false discovery rate (FDR) from published p-values. However, little formal research addresses how, and to what extent, the presumed selection (publication bias) model affects the bias and variance of FDR estimators. This manuscript provides general and closed-form expressions for the bias and variance of an established FDR estimator when the assumed publication bias model ($p<0.05$) may or may not be correct. The expressions reveal that FDR estimates can be conservative or liberal, depending on how well a $p<0.05$ publication rule approximates the true selection mechanism. Analysis of a well-studied large-scale replication project in psychology, in which selection model parameters are estimable, suggests that the bias expressions are accurate in practice. A second well-studied collection of p-values, mined from medical journal abstracts, illustrates how the closed-form expressions can facilitate a simple sensitivity analysis when the goal is FDR estimation from selected p-values with an unknown selection mechanism.