In this work, we consider the problem of building distribution-free prediction intervals with finite-sample conditional coverage guarantees. Conformal prediction (CP) is an increasingly popular framework for building prediction intervals with distribution-free guarantees, but these guarantees only ensure marginal coverage: the probability of coverage is averaged over a random draw of both the training and test data, so there may be substantial undercoverage within certain subpopulations. Ideally, we would instead want local coverage guarantees that hold for each possible value of the test point's features. While the impossibility of achieving pointwise local coverage is well established in the literature, many variants of conformal prediction algorithms show favorable local coverage properties empirically. Relaxing the definition of local coverage can allow for a theoretical understanding of this empirical phenomenon. We aim to bridge this gap between theoretical validation and empirical performance by proving achievable and interpretable guarantees for a relaxed notion of local coverage. Building on the localized CP method of Guan (2023) and the weighted CP framework of Tibshirani et al. (2019), we propose a new method, randomly-localized conformal prediction (RLCP), which returns prediction intervals that are not only marginally valid but also achieve a relaxed local coverage guarantee and guarantees under covariate shift. Through a series of simulations and real data experiments, we validate these coverage guarantees of RLCP and compare it with other local conformal prediction methods.
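
To make the distinction between marginal and local coverage concrete, the following is a minimal sketch of standard split conformal prediction on synthetic heteroscedastic data. It is not RLCP and is not taken from the paper: the data-generating process, the constant-zero predictor, and the names `split_conformal_halfwidth` and `simulate` are all assumptions chosen for illustration. Under these assumptions, coverage averaged over all test points should sit near the nominal level, while coverage restricted to the high-noise region falls well below it.

```python
import numpy as np


def split_conformal_halfwidth(residuals, alpha):
    """Half-width q of the split conformal interval [pred - q, pred + q].

    Uses the finite-sample corrected rank ceil((n + 1) * (1 - alpha)),
    which yields marginal coverage of at least 1 - alpha under exchangeability.
    """
    residuals = np.asarray(residuals, dtype=float)
    n = len(residuals)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return np.inf
    return np.sort(residuals)[k - 1]


def simulate(n_cal=2000, n_test=20000, alpha=0.1, seed=0):
    """Return (marginal, low-noise, high-noise) empirical coverage.

    Hypothetical setup: X ~ Uniform(-1, 1), Y | X ~ Normal(0, (0.2 + |X|)^2),
    and the point predictor is the constant 0 (the true conditional mean).
    """
    rng = np.random.default_rng(seed)

    def draw(n):
        x = rng.uniform(-1.0, 1.0, size=n)
        y = rng.normal(0.0, 0.2 + np.abs(x))  # noise scale grows with |x|
        return x, y

    _, y_cal = draw(n_cal)
    x_test, y_test = draw(n_test)

    # Conformity scores are absolute residuals of the constant-zero predictor.
    q = split_conformal_halfwidth(np.abs(y_cal), alpha)
    covered = np.abs(y_test) <= q

    marginal = covered.mean()
    low_noise = covered[np.abs(x_test) < 0.25].mean()
    high_noise = covered[np.abs(x_test) > 0.75].mean()
    return marginal, low_noise, high_noise


if __name__ == "__main__":
    marginal, low_noise, high_noise = simulate()
    print(f"marginal coverage:          {marginal:.3f}")
    print(f"coverage where |x| < 0.25:  {low_noise:.3f}")
    print(f"coverage where |x| > 0.75:  {high_noise:.3f}")
```

Because a single half-width is used for every test point, the interval is too wide where the noise is small and too narrow where it is large. Localized methods such as those discussed in the abstract aim to reduce this kind of imbalance by adapting the calibration to the neighborhood of the test point.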