This paper introduces weighted conformal p-values for model-free selective inference. Given units with observed covariates $X$ and missing responses $Y$, the goal is to select those units whose responses exceed user-specified values while controlling the proportion of falsely selected units. We extend [JC22] to settings with covariate shift between training and test samples, making no modeling assumptions on the data and placing no restrictions on the model used to predict the responses. Using any predictive model, we first construct well-calibrated weighted conformal p-values, which control the type-I error in detecting a large response/outcome for each individual unit. However, a well-known positive dependence property among the p-values can be violated because of the covariate-dependent weights, which complicates the use of classical multiple testing procedures. We therefore introduce weighted conformalized selection (WCS), a new multiple testing procedure that leverages a special conditional independence structure implied by weighted exchangeability to achieve false discovery rate (FDR) control in finite samples. Beyond prediction-assisted candidate screening, we study how WCS (1) enables simultaneous inference on multiple individual treatment effects, and (2) extends to outlier detection when the distribution of reference inliers differs from that of test inliers. We demonstrate its performance via simulations and apply WCS to causal inference, drug discovery, and outlier detection datasets.
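To make the first step concrete, the following is a minimal sketch of one common form of a weighted conformal p-value, not the full WCS procedure. It assumes a monotone score such as $V(x, y) = y - \hat\mu(x)$, so that the test score for unit $j$ is $V(X_{n+j}, c_j)$ with $c_j$ the user-specified threshold, and it assumes the covariate-shift weights $w(x)$ are known or have been estimated separately. The function name `weighted_conformal_pvalues` and its argument names are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_conformal_pvalues(calib_scores, calib_weights, test_scores, test_weights):
    """Sketch of weighted conformal p-values (assumed form, for illustration).

    calib_scores  : scores V(X_i, Y_i) on labeled calibration units
    calib_weights : covariate-shift weights w(X_i) for those units
    test_scores   : scores V(X_{n+j}, c_j) evaluated at the thresholds c_j
    test_weights  : weights w(X_{n+j}) for the test units
    """
    calib_scores = np.asarray(calib_scores, dtype=float)
    calib_weights = np.asarray(calib_weights, dtype=float)
    test_scores = np.asarray(test_scores, dtype=float)
    test_weights = np.asarray(test_weights, dtype=float)

    # below[j, i] is True when calibration score i is strictly below test score j.
    below = calib_scores[None, :] < test_scores[:, None]

    # Weighted count of calibration scores below each test score,
    # plus the test unit's own weight.
    numer = (below * calib_weights[None, :]).sum(axis=1) + test_weights

    # Total weight of the calibration units plus the test unit.
    denom = calib_weights.sum() + test_weights
    return numer / denom

# Example call with unit weights, which reduces to an unweighted conformal p-value.
p_unweighted = weighted_conformal_pvalues(
    calib_scores=[0.0, 1.0, 2.0, 3.0],
    calib_weights=[1.0, 1.0, 1.0, 1.0],
    test_scores=[-1.0, 2.5],
    test_weights=[1.0, 1.0],
)
```

A small p-value here indicates that the predicted response is large relative to the threshold. Because each p-value's denominator depends on its own test weight, the p-values are coupled through the weights in a covariate-dependent way, which is the source of the dependence issue that motivates WCS.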