In selective inference, confidence envelopes for false discoveries let the user select any subset of null hypotheses while retaining a statistical guarantee on the number of false discoveries in the selected set. Many constructions of such envelopes have been proposed recently, using local test families (Genovese and Wasserman, 2006; Goeman and Solari, 2011), paths (Katsevich and Ramdas, 2020) or interpolation (Blanchard et al., 2020a). All of these methods have been well studied in the homogeneous case, where every p-value under the null is uniformly distributed over [0, 1]. In many applications, however, the data are heterogeneous and discrete, so the p-values have heterogeneous, discrete distributions, and the previous constructions may lose power, in the sense that they over-estimate the number of false discoveries. In this paper, we use new tools to bridge the previous constructions in the homogeneous case. We then apply these tools to propose several confidence envelopes tailored to heterogeneous data, based for instance on the Bretagnolle inequality or on a new variant of the Simes inequality. We compare these new envelopes to their homogeneous counterparts on simulated data.