This paper introduces Density-Calibrated Conformal Quantile Regression (CQR-d), a novel method for constructing prediction intervals that adapt to varying uncertainty across the feature space. Building on conformal quantile regression, CQR-d incorporates local information through a weighted combination of local and global conformity scores, with weights determined by local data density. We prove that CQR-d provides valid marginal coverage at level $1 - \alpha - \epsilon$, where $\epsilon$ is a small tolerance arising from numerical optimization. Through extensive simulation studies and an application to a heteroscedastic dataset available in R, we demonstrate that CQR-d maintains the desired coverage while producing substantially narrower prediction intervals than standard conformal quantile regression (CQR). Notably, in the application to heteroscedastic data, CQR-d achieves an $8.6\%$ reduction in average interval width while maintaining comparable coverage. The method is particularly effective in settings with clear local uncertainty patterns, making it a valuable tool for prediction tasks in heterogeneous data environments.
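To make the idea of blending local and global conformity scores concrete, the following is a minimal sketch and not the procedure analyzed in this paper. The conformity score $\max(\hat{q}_{lo}(x) - y,\ y - \hat{q}_{hi}(x))$ and the finite-sample quantile are those of standard CQR. Everything else is an assumption made for illustration: the function names, the one-dimensional feature, the $k$-nearest-neighbor density estimate, the mapping from density to a weight in $(0, 1)$, the fixed-width quantile predictions, and the synthetic data. The sketch omits the numerical optimization that gives rise to $\epsilon$ and makes no claim to the coverage guarantee stated above.

```python
import numpy as np


def cqr_scores(y, q_lo, q_hi):
    """Standard CQR conformity score: max(q_lo - y, y - q_hi)."""
    return np.maximum(q_lo - y, y - q_hi)


def conformal_quantile(scores, alpha):
    """Finite-sample (1 - alpha) quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(scores)[min(rank, n) - 1]


def density_weighted_correction(x_cal, scores, x_new, alpha=0.1, k=50):
    """Hypothetical blend of local and global score quantiles (illustration only)."""
    q_global = conformal_quantile(scores, alpha)

    # Distances from each new point to every calibration point (1-D features assumed).
    dist = np.abs(x_cal[None, :] - x_new[:, None])
    idx = np.argsort(dist, axis=1)[:, :k]  # indices of the k nearest calibration points
    radius = np.take_along_axis(dist, idx, axis=1)[:, -1]

    # Assumed 1-D kNN density estimate and an assumed map from density to (0, 1).
    density = k / (2.0 * len(x_cal) * np.maximum(radius, 1e-12))
    weight = density / (1.0 + density)

    # Local quantile computed from the scores of the k nearest neighbors.
    q_local = np.array([conformal_quantile(scores[i], alpha) for i in idx])
    return weight * q_local + (1.0 - weight) * q_global


# Synthetic heteroscedastic calibration data (assumed): noise grows with x.
rng = np.random.default_rng(0)
x_cal = rng.uniform(0.0, 1.0, size=500)
y_cal = x_cal + (0.1 + x_cal) * rng.normal(size=500)

# Deliberately crude fixed-width quantile predictions stand in for a fitted model.
scores = cqr_scores(y_cal, x_cal - 0.5, x_cal + 0.5)

x_new = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
correction = density_weighted_correction(x_cal, scores, x_new, alpha=0.1, k=50)
lower = (x_new - 0.5) - correction
upper = (x_new + 0.5) + correction
```

In this sketch, a higher estimated density moves the correction toward the local quantile, on the reasoning that more nearby calibration points make the local estimate more reliable; where data are sparse, the correction falls back toward the global CQR quantile.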