We provide new false discovery proportion (FDP) confidence envelopes in several multiple testing settings relevant to modern high-dimensional data methods. We revisit the scenarios considered in the recent work of \cite{katsevich2020simultaneous} (top-$k$; preordered, including knockoffs; online), with particular emphasis on obtaining FDP bounds that have both non-asymptotic coverage and asymptotic consistency, i.e., that converge below the desired level $\alpha$ when applied to a classical $\alpha$-level false discovery rate (FDR) controlling procedure. The resulting bounds improve on existing ones, both in theory and in practice, and are suited to situations where at least a moderate number of rejections is expected. These improvements are illustrated with numerical experiments and real data examples. The improvement is especially substantial in the knockoff setting, which highlights the practical relevance of the method. As side results, we introduce a new confidence envelope for the empirical cumulative distribution function of i.i.d. uniform variables and provide new power results in sparse cases, both of independent interest.