Conformalized Quantile Regression (CQR) is a recently proposed method for constructing prediction intervals for a response $Y$ given covariates $X$, without making distributional assumptions. However, as we demonstrate empirically, existing constructions of CQR can be ineffective for problems where the quantile regressors perform better in certain parts of the feature space than others. The reason is that the prediction intervals of CQR do not distinguish between two forms of uncertainty: first, the variability of the conditional distribution of $Y$ given $X$ (i.e., aleatoric uncertainty), and second, our uncertainty in estimating this conditional distribution (i.e., epistemic uncertainty). This can lead to uneven coverage, with intervals that are overly wide (or overly narrow) in regions where epistemic uncertainty is low (or high). To address this, we propose a new variant of the CQR methodology, Uncertainty-Aware CQR (UACQR), that explicitly separates these two sources of uncertainty to adjust quantile regressors differentially across the feature space. Compared to CQR, our methods enjoy the same distribution-free theoretical coverage guarantees, while in our experiments they demonstrate stronger conditional coverage in simulated settings and tighter intervals on a range of real-world data sets.
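To make the limitation concrete, the following is a minimal sketch of the standard split-conformal CQR calibration step, not of UACQR itself. It assumes lower and upper quantile predictions are already available on a held-out calibration set; the function names `cqr_correction` and `cqr_interval` are hypothetical and chosen for illustration only. The point to notice is that the correction `q_hat` is a single scalar, so every test point's interval is widened (or shrunk) by the same amount regardless of how well the quantile regressors are estimated near that point.

```python
import math

def cqr_correction(lo_cal, hi_cal, y_cal, alpha=0.1):
    """Split-CQR calibration (illustrative sketch, hypothetical helper name).

    lo_cal, hi_cal: estimated lower/upper quantiles on the calibration set.
    y_cal: observed responses on the calibration set.
    Returns one scalar correction that is applied at every test point.
    """
    n = len(y_cal)
    # Conformity score: positive when y falls outside [lo, hi],
    # negative when y lies strictly inside the estimated interval.
    scores = sorted(max(lo - y, y - hi)
                    for lo, hi, y in zip(lo_cal, hi_cal, y_cal))
    # Finite-sample-corrected rank: ceil((1 - alpha) * (n + 1)).
    k = math.ceil((1 - alpha) * (n + 1))
    # If the rank exceeds n, the calibration set is too small and the
    # interval must be infinite to keep the coverage guarantee.
    return scores[k - 1] if k <= n else math.inf

def cqr_interval(lo_test, hi_test, q_hat):
    """Apply the same additive correction to both ends of the interval."""
    return lo_test - q_hat, hi_test + q_hat
```

For example, with ten calibration points whose estimated interval is `[0, 1]`, nine responses at `0.5` and one at `1.25`, the correction at `alpha=0.1` equals the largest score, `0.25`, and a test interval `[2, 3]` becomes `[1.75, 3.25]`. A feature-dependent adjustment, as the abstract proposes, would replace this constant shift with one that varies across the feature space.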