We study split-conformal prediction for regression when the reported prediction set must be a single interval, at target marginal coverage $1-\alpha$, where $\alpha$ is the nominal miscoverage level. Under this reporting constraint, the natural conditional target is the shortest interval with conditional mass at least $1-\alpha$, rather than an equal-tailed interval or a possibly disconnected high-probability set. We parameterize this single-interval oracle by a lower-tail allocation, which determines how the nominal miscoverage $\alpha$ is split between the two endpoints, and propose tail-allocation conformalized quantile regression (TA-CQR). TA-CQR estimates this allocation by searching over quantile-defined cores and then applies nonnegative additive split-conformal calibration, retaining exact finite-sample marginal coverage under exchangeability. The main contribution is theoretical. We characterize the oracle geometry, including its highest-density interpretation under unimodality and the positive connectedness cost induced by disconnected highest-density sets. We prove local recovery of the selected allocation and core, establish that calibration radii are asymptotically negligible under endpoint-density conditions, and give a finite-sample calibrated length oracle inequality with explicit grid, endpoint-quantile estimation, and calibration-sampling terms. Simulations and real-data examples report coverage and length jointly.
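The two stages described above (a grid search over lower-tail allocations, then nonnegative additive split-conformal calibration) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each candidate core has the form $[q_t(x), q_{t+1-\alpha}(x)]$ for an allocation $t \in [0, \alpha]$, that the search is done separately at each point, and that the conformity score is the usual signed distance to the core. The names `select_allocation`, `conformal_radius`, and `toy_quantile` are hypothetical, and `toy_quantile` returns exact quantiles of a toy model in place of a fitted quantile regressor.

```python
import numpy as np


def toy_quantile(x, level):
    # Hypothetical stand-in for a fitted quantile regressor: exact conditional
    # quantiles of the toy model Y = x + (1 + x) * Exp(1).
    level = np.clip(level, 1e-6, 1 - 1e-6)  # keep extreme levels finite
    return x - (1.0 + x) * np.log1p(-level)


def select_allocation(quantile_fn, x, alpha, grid_size=21):
    # Assumed search: for each point, pick the lower-tail allocation t on a grid
    # over [0, alpha] whose core [q_t(x), q_{t + 1 - alpha}(x)] is shortest.
    ts = np.linspace(0.0, alpha, grid_size)
    lo = np.stack([quantile_fn(x, t) for t in ts])              # (grid, n)
    hi = np.stack([quantile_fn(x, t + 1 - alpha) for t in ts])  # (grid, n)
    best = np.argmin(hi - lo, axis=0)
    idx = np.arange(x.shape[0])
    return ts[best], lo[best, idx], hi[best, idx]


def conformal_radius(lo, hi, y, alpha):
    # Split-conformal radius from calibration scores, clipped at zero so that
    # calibration can widen the core but never shrink it.
    scores = np.maximum(lo - y, y - hi)
    n = scores.shape[0]
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return max(0.0, float(np.sort(scores)[k - 1]))


def sample(rng, n):
    # Draw (x, y) pairs from the right-skewed toy model.
    x = rng.uniform(0.0, 1.0, n)
    y = x + (1.0 + x) * rng.exponential(1.0, n)
    return x, y


rng = np.random.default_rng(0)
alpha = 0.1
x_cal, y_cal = sample(rng, 2000)
x_test, y_test = sample(rng, 5000)

# Calibrate on held-out data using the same selection rule as at test time.
_, lo_cal, hi_cal = select_allocation(toy_quantile, x_cal, alpha)
radius = conformal_radius(lo_cal, hi_cal, y_cal, alpha)

t_test, lo_test, hi_test = select_allocation(toy_quantile, x_test, alpha)
lower, upper = lo_test - radius, hi_test + radius

coverage = np.mean((y_test >= lower) & (y_test <= upper))
mean_length = np.mean(upper - lower)
eq_length = np.mean(
    toy_quantile(x_test, 1 - alpha / 2) - toy_quantile(x_test, alpha / 2)
)
print(f"coverage={coverage:.3f}  length={mean_length:.3f}  "
      f"equal-tailed core length={eq_length:.3f}")
```

In this toy model the conditional density is decreasing, so the search places all of the miscoverage in the upper tail ($t = 0$) and the selected core is shorter than the equal-tailed one. Because the selection rule is a fixed function of the quantile model, applying it to both calibration and test points keeps the scores exchangeable, which is what the marginal coverage guarantee relies on.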