We investigate the multiplicity model in which $m$ values of a test statistic are drawn independently from a mixture of no effect (null) and positive effect (alternative), and we seek to identify the alternative test results with a controlled error rate. We are interested in the case where the alternatives are rare. A number of multiple testing procedures filter the set of ordered p-values in order to eliminate the nulls. Such an approach can only work if the p-values originating from the alternatives form one or several identifiable clusters. The Benjamini and Hochberg (BH) method, for example, assumes that this cluster occurs in a small interval $(0,\Delta)$ and filters out all or most of the ordered p-values $p_{(r)}$ above a linear threshold $s \times r$. In repeated applications, this filter controls the false discovery rate (FDR) via the slope $s$. We propose a new adaptive filter that deletes the p-values from regions of uniform distribution. In cases where a single cluster remains, the p-values in an interval are declared alternatives, with the midpoint and the length of the interval chosen by controlling the data-dependent FDR at a desired level.
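As an illustration of the linear-threshold filter described above, the following is a minimal sketch of the standard BH step-up rule, not of the adaptive filter proposed here. It assumes the usual slope $s = q/m$ for a target FDR level $q$, a choice the text above does not state explicitly. The function name `bh_filter` and the parameter `q` are hypothetical names introduced for this example.

```python
def bh_filter(pvalues, q=0.05):
    """Sketch of the BH step-up rule with assumed slope s = q / m.

    Returns a list of booleans, True where the p-value is declared
    an alternative (i.e. survives the filter).
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values in increasing order: order[r-1] holds p_(r).
    order = sorted(range(m), key=lambda i: pvalues[i])
    s = q / m  # assumed slope of the linear threshold s * r
    # Find the largest rank r with p_(r) <= s * r (step-up rule).
    k = 0
    for r, i in enumerate(order, start=1):
        if pvalues[i] <= s * r:
            k = r
    # Keep the k smallest p-values; all others are filtered out.
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.06, 0.074, 0.205, 0.212, 0.216]
    print(bh_filter(example, q=0.05))
```

Because the rule searches for the largest rank below the line, a p-value that lies above its own threshold can still be kept when a larger ordered p-value falls below the line; this is why the filter removes "all or most" of the p-values above the threshold rather than all of them.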