Conformal selection (CS) uses calibration data to identify test inputs whose unobserved outcomes are likely to satisfy a pre-specified minimal quality requirement, while controlling the false discovery rate (FDR). Existing methods fix the target FDR level before observing data, which prevents the user from adapting the trade-off between the number of selected test inputs and the FDR to downstream needs and constraints in light of the available data. For example, in genomics or neuroimaging, researchers often inspect the distribution of test statistics and then decide how aggressively to pursue candidates based on the observed strength of evidence and the available follow-up resources. To address this limitation, we introduce post-hoc CS (PH-CS), which generates a path of candidate selection sets, each paired with a data-driven false discovery proportion (FDP) estimate. PH-CS lets the user select any operating point on this path by maximizing a user-specified utility, arbitrarily balancing selection size and FDR. Building on conformal e-variables and the e-Benjamini-Hochberg (e-BH) procedure, PH-CS is proved to provide a finite-sample post-hoc reliability guarantee: the ratio of the true FDP to the estimated FDP level is, on average, upper bounded by $1$, so that the average estimated FDP is, to first order, a valid upper bound on the true FDR. PH-CS is further extended to quality requirements defined in terms of a general risk. Experiments on synthetic and real-world datasets demonstrate that, unlike CS, PH-CS can consistently satisfy user-imposed utility constraints while producing reliable FDP estimates and maintaining competitive FDR control.
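To make the idea of a selection path concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes that one e-value per test input has already been computed (how the conformal e-variables are built is not shown here), and it pairs each top-$k$ set with the level $m/(k\,e_{(k)})$, the smallest $\alpha$ at which the $k$-th largest e-value clears the e-BH threshold $m/(\alpha k)$. The function names `ebh_path` and `pick_operating_point`, the example e-values, and the example utility are all hypothetical.

```python
import numpy as np


def ebh_path(e_values):
    """Return a list of (selected_indices, level) pairs, one per set size k.

    Assumption: `e_values` holds one precomputed e-value per test input.
    For each k, the candidate set is the k inputs with the largest e-values,
    and the paired level is min(1, m / (k * e_(k))), i.e. the smallest alpha
    at which the k-th largest e-value meets the e-BH threshold m / (alpha * k).
    """
    e = np.asarray(e_values, dtype=float)
    m = e.size
    order = np.argsort(-e, kind="stable")  # indices sorted by decreasing e-value
    sorted_e = e[order]
    ks = np.arange(1, m + 1)
    with np.errstate(divide="ignore"):  # an e-value of 0 gives level 1 after capping
        levels = np.minimum(1.0, m / (ks * sorted_e))
    return [(order[:k], float(levels[k - 1])) for k in ks]


def pick_operating_point(path, utility):
    """Pick the path entry maximizing a user-supplied utility(size, level)."""
    return max(path, key=lambda entry: utility(len(entry[0]), entry[1]))


if __name__ == "__main__":
    # Hypothetical e-values for four test inputs.
    path = ebh_path([8.0, 0.5, 4.0, 2.0])

    # Hypothetical utility: as many selections as possible, subject to
    # the paired level not exceeding 0.6.
    def utility(size, level):
        return size if level <= 0.6 else float("-inf")

    selected, level = pick_operating_point(path, utility)
    print(sorted(selected.tolist()), level)
```

The utility is applied after the whole path has been computed, which mirrors the post-hoc setting described above: the operating point is chosen after looking at the data rather than fixed in advance.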