Conformal prediction is a framework for constructing prediction intervals with distribution-free validity: it guarantees predictive coverage for data drawn from any distribution. Its two main variants are full conformal prediction and split conformal prediction (also called transductive and inductive conformal prediction, respectively). Full conformal prediction is widely considered to be statistically more efficient, since split conformal prediction requires data splitting and the resulting loss in sample size can lead to wider prediction intervals. Its implementation, however, is computationally prohibitive, because the underlying model must be refit for every candidate value in the response space. Existing computational shortcuts, such as approximating the full conformal construction on a discrete grid of values, frequently lack theoretical guarantees on marginal coverage and can fail in practice. To address this limitation, we introduce a novel class of approximations to the full conformal prediction method, based on the idea of \emph{tournaments}, which enables the construction of prediction sets with a rigorous marginal coverage guarantee of $1-2\alpha$. Under stability conditions, the theoretical coverage guarantee tightens to approximately $1-\alpha$. This new framework generalizes the existing method of leave-one-out cross-conformal prediction, while allowing for flexible use of various existing approximation strategies.
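To make the computational contrast concrete, the following is a minimal sketch of split conformal prediction next to the grid approximation of full conformal prediction mentioned above. It is not the tournament method, which the abstract does not specify. The base model (closed-form ridge regression), the absolute-residual score, and the function names `fit_ridge`, `split_conformal`, and `full_conformal_grid` are all illustrative assumptions.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression (an assumed base model); returns coefficients."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def split_conformal(X, y, x_new, alpha=0.1, lam=1.0):
    """Split conformal interval: fit on the first half, calibrate on the second."""
    half = len(y) // 2
    beta = fit_ridge(X[:half], y[:half], lam)       # a single model fit
    scores = np.abs(y[half:] - X[half:] @ beta)     # calibration residuals
    m = len(scores)
    k = int(np.ceil((1 - alpha) * (m + 1)))         # rank of the conformal quantile
    q = np.sort(scores)[k - 1] if k <= m else np.inf
    center = x_new @ beta
    return center - q, center + q

def full_conformal_grid(X, y, x_new, grid, alpha=0.1, lam=1.0):
    """Grid approximation to full conformal: one model refit per candidate value."""
    X_aug = np.vstack([X, x_new])                   # training points plus test point
    n = len(y)
    k = int(np.ceil((1 - alpha) * (n + 1)))
    accepted = []
    for y_cand in grid:
        y_aug = np.append(y, y_cand)
        beta = fit_ridge(X_aug, y_aug, lam)         # refit on the augmented data
        scores = np.abs(y_aug - X_aug @ beta)
        # Accept the candidate if the test score is at most the k-th smallest score.
        if np.sum(scores < scores[-1]) < k:
            accepted.append(y_cand)
    return np.array(accepted)

if __name__ == "__main__":
    rng = np.random.default_rng(0)                  # synthetic data for illustration
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=100)
    x_new = rng.normal(size=3)
    print("split conformal interval:", split_conformal(X, y, x_new))
    grid = np.linspace(y.min() - 1, y.max() + 1, 200)
    full_set = full_conformal_grid(X, y, x_new, grid)
    print("grid points accepted by full conformal:", full_set.size)
```

In this sketch the split method fits the model once, while the grid method performs one refit per grid point (200 here). Restricting candidates to a finite grid is the kind of shortcut that, as noted above, can lack a marginal coverage guarantee.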