Exploiting structural information in data while preserving statistical validity is a significant challenge in false discovery rate (FDR) analyses. Conformal inference provides a rigorous theoretical foundation for complex machine learning methods without relying on strong assumptions or highly idealized models. However, existing conformal methods are limited in their ability to handle structured multiple testing, because their validity requires symmetric rules, which assume that the data points are exchangeable and that the fitting algorithm is permutation-invariant. To overcome these limitations, we introduce the pseudo local index of significance (PLIS) procedure, which accommodates asymmetric rules and requires only pairwise exchangeability between the null conformity scores. We show that PLIS controls the FDR in finite samples while allowing higher weights to be assigned to relevant data points. Numerical results confirm the effectiveness and robustness of PLIS and show improved power over existing model-free methods in various scenarios.