The effective utilization of structural information in data while ensuring statistical validity poses a significant challenge in false discovery rate (FDR) analyses. Conformal inference provides rigorous theory for grounding complex machine learning methods without relying on strong assumptions or highly idealized models. However, existing conformal methods have limitations in handling structured multiple testing. This is because their validity requires the deployment of symmetric rules, which assume the exchangeability of data points and permutation-invariance of fitting algorithms. To overcome these limitations, we introduce the pseudo local index of significance (PLIS) procedure, which is capable of accommodating asymmetric rules and requires only pairwise exchangeability between the null conformity scores. We demonstrate that PLIS offers finite-sample guarantees in FDR control and the ability to assign higher weights to relevant data points. Numerical results confirm the effectiveness and robustness of PLIS and show improvements in power compared to existing model-free methods in various scenarios.