This paper is concerned with estimating the local false discovery rate (lfdr) in a two-groups model where the only assumption on the null distribution is symmetry about zero. Our motivation comes from a contemporary framework for multiple hypothesis testing, particularly relevant in variable selection problems, which transforms any user-specified scores into statistics whose null distributions are symmetric about zero, whereas the non-null statistics are generally expected to be enriched to the right of zero. While modern methods such as the knockoff filter (Barber and Candès, 2015) exploit this null symmetry to control the false discovery rate (FDR), an arguably more appropriate goal is to control the local false discovery rate of the rejected hypotheses, as proposed in Soloff et al. (2024), who analyze the standard two-groups model (known $f_0$ and independence). Here, we take a step in this direction and propose to estimate the lfdr by targeting the surrogate density ratio $f(-w)/f(w)$, for $w>0$, where $f$ is the marginal density in the aforementioned ``stripped-down'' two-groups model. We study several estimators and propose a logistic-regression-based method with a natural cubic spline basis. We also show that, for any consistent estimator of this surrogate, the multiple testing procedure that thresholds the estimate at the nominal level achieves asymptotic lfdr control.
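To make the surrogate concrete, here is a minimal sketch, not the paper's implementation. It relies on two facts that follow from the model as stated. First, writing $f = \pi_0 f_0 + (1-\pi_0) f_1$ with $f_0$ symmetric, we have $f(-w) \ge \pi_0 f_0(-w) = \pi_0 f_0(w)$, so $f(-w)/f(w) \ge \mathrm{lfdr}(w)$ and the surrogate is conservative. Second, conditional on $|W| = w$, the odds that $W$ is negative equal $f(-w)/f(w)$, so a logistic regression of the sign indicator on a spline basis in $|W|$ estimates the log of the surrogate. Everything else is an assumption made for illustration: the Gaussian simulation, the knot placement at quantiles of $|W|$, the small ridge term, and all function names (`natural_cubic_basis`, `fit_logistic`, `fit_surrogate_ratio`, `simulate`) are hypothetical.

```python
import numpy as np


def natural_cubic_basis(x, knots):
    """Natural cubic spline basis in truncated-power form (no intercept column)."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    K, last = len(knots), knots[-1]

    def d(k):
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - last, 0.0) ** 3
        return num / (last - knots[k])

    # One linear column plus K - 2 columns that are linear beyond the boundary knots.
    return np.column_stack([x] + [d(k) - d(K - 2) for k in range(K - 2)])


def fit_logistic(X, y, ridge=1e-4, n_iter=100, tol=1e-8):
    """Newton-Raphson logistic fit; X must already include an intercept column.

    The tiny ridge penalty is an assumption added only for numerical stability.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (y - p) - ridge * beta
        hess = (X.T * (p * (1.0 - p))) @ X + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def fit_surrogate_ratio(w, n_knots=5):
    """Return a function t -> estimate of f(-t)/f(t) for t > 0, capped at 1."""
    w = np.asarray(w, dtype=float)
    a = np.abs(w)
    y = (w < 0).astype(float)  # response: 1 if the statistic is negative
    knots = np.quantile(a, np.linspace(0.05, 0.95, n_knots))  # assumed placement
    B = natural_cubic_basis(a, knots)
    mean, std = B.mean(axis=0), B.std(axis=0)

    def design(t):
        Z = (natural_cubic_basis(t, knots) - mean) / std  # standardize columns
        return np.column_stack([np.ones(len(Z)), Z])

    beta = fit_logistic(design(a), y)

    def ratio(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        # exp(min(eta, 0)) equals min(exp(eta), 1): the fitted odds, capped at 1.
        return np.exp(np.minimum(design(t) @ beta, 0.0))

    return ratio


def simulate(n=5000, pi0=0.8, mu=2.5, seed=0):
    """Toy two-groups data: nulls are N(0, 1), non-nulls are N(mu, 1)."""
    rng = np.random.default_rng(seed)
    is_null = rng.random(n) < pi0
    w = rng.normal(loc=np.where(is_null, 0.0, mu), scale=1.0)
    return w, is_null


# Usage: threshold the estimated surrogate at a nominal level q.
w, is_null = simulate()
ratio = fit_surrogate_ratio(w)
q = 0.2
rejected = (w > 0) & (ratio(np.abs(w)) <= q)
print("rejections:", int(rejected.sum()))
```

Only positive statistics are candidates for rejection here, since the surrogate is defined for $w > 0$ and the non-nulls are expected to be enriched to the right of zero. Capping the estimate at 1 reflects that the lfdr itself cannot exceed 1.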