In multiple testing, estimating the proportion $π_0$ of non-signal in the data in order to boost the power of false discovery rate (FDR) controlling procedures is a long-standing research theme, sometimes referred to as 'adaptive FDR control'. Interest in this theme has been reinforced in recent years by conformal novelty detection, where similar tools can be combined with any 'blackbox' machine learning algorithm. Nevertheless, perhaps surprisingly, finding a solution for adaptive FDR control that is optimal in a broad sense is still an open problem. This paper fills this gap by introducing new $π_0$-estimators, referred to as min-Storey (MS) and interval-min-Storey (IMS), which build upon the so-called 'Storey estimator'. Plugging these estimators into the adaptive Benjamini-Hochberg (BH) procedure is shown to deliver FDR control in both the independent and conformal settings. In addition, these methods satisfy an optimal power property over any (regular) alternative distribution. The strong performance of the new adaptive procedures is illustrated with numerical experiments in both the independent and conformal models for various distribution structures.
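For orientation, the classical building blocks named above can be sketched as follows. This is a minimal illustration of the standard Storey estimator, $\hat{π}_0(λ) = (1 + \#\{i : p_i > λ\}) / (m(1-λ))$, plugged into the BH step-up procedure run at level $α/\hat{π}_0$; it is not the MS or IMS estimator, whose definitions are not given in this abstract. The function names `storey_pi0` and `adaptive_bh`, the default values `lam=0.5` and `alpha=0.1`, and the capping of the estimate at 1 are assumptions made for this sketch.

```python
import numpy as np


def storey_pi0(pvals, lam=0.5):
    """Classical Storey estimate of the null proportion (with the +1 correction).

    Hypothetical helper: counts p-values above `lam` and rescales by m * (1 - lam).
    The cap at 1 is a choice made for this sketch.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    return min(1.0, (1.0 + np.sum(p > lam)) / (m * (1.0 - lam)))


def adaptive_bh(pvals, alpha=0.1, lam=0.5):
    """BH step-up procedure with thresholds alpha * k / (m * pi0_hat).

    Returns a boolean rejection mask (in the original order) and the pi0 estimate.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    pi0 = storey_pi0(p, lam)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / (m * pi0)
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: reject every hypothesis up to the largest index
        # whose sorted p-value falls below its threshold.
        k = np.max(np.nonzero(below)[0])
        rejected[order[: k + 1]] = True
    return rejected, pi0
```

A smaller estimate $\hat{π}_0$ enlarges every BH threshold by the factor $1/\hat{π}_0$, which is the source of the power gain that adaptive procedures aim for.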