While conformal predictors come with rigorous statistical guarantees on their error frequency, the size of their prediction sets is critical to their practical utility. Unfortunately, finite-sample analysis and guarantees for prediction set sizes are currently lacking. To address this shortfall, we theoretically quantify the expected size of the prediction sets under the split conformal prediction framework. Because this exact expression usually cannot be computed directly, we further derive point estimates and high-probability interval bounds that can be computed empirically, providing a practical method for characterizing the expected set size. We corroborate the efficacy of our results with experiments on real-world datasets for both regression and classification problems.
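To make the quantity under study concrete, the following is a minimal sketch of split conformal prediction for regression with absolute-residual scores, where the prediction set is an interval whose size is twice the calibrated quantile. It is not the estimator or bound derived in the paper: the synthetic data, the least-squares model, and the function names `conformal_half_width` and `simulate_expected_size` are all assumptions made for illustration. The expected size is approximated by brute-force Monte Carlo over fresh calibration draws, which is only possible here because the data-generating process is known.

```python
import numpy as np


def conformal_half_width(scores, alpha):
    """Split conformal quantile of the calibration scores.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small for the requested alpha.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return float("inf")
    return float(np.sort(scores)[k - 1])


def simulate_expected_size(n_cal=200, alpha=0.1, n_trials=500, seed=0):
    """Monte Carlo estimate of the expected interval size (hypothetical setup).

    Assumes synthetic data y = 2x + standard normal noise and a fixed
    least-squares line fitted once on a separate training split.
    """
    rng = np.random.default_rng(seed)

    # Training split: fit the point predictor once and keep it fixed.
    x_train = rng.uniform(-1, 1, size=500)
    y_train = 2.0 * x_train + rng.normal(0, 1, size=500)
    slope, intercept = np.polyfit(x_train, y_train, deg=1)

    sizes = np.empty(n_trials)
    for t in range(n_trials):
        # Fresh calibration split for each trial.
        x_cal = rng.uniform(-1, 1, size=n_cal)
        y_cal = 2.0 * x_cal + rng.normal(0, 1, size=n_cal)
        scores = np.abs(y_cal - (slope * x_cal + intercept))
        # With absolute-residual scores the interval is [f(x) - q, f(x) + q],
        # so its size is 2q for every test point.
        sizes[t] = 2.0 * conformal_half_width(scores, alpha)

    # Mean size across calibration draws and its Monte Carlo standard error.
    return sizes.mean(), sizes.std(ddof=1) / np.sqrt(n_trials)


if __name__ == "__main__":
    mean_size, stderr = simulate_expected_size()
    print(f"estimated expected interval size: {mean_size:.3f} (s.e. {stderr:.3f})")
```

In practice only a single calibration set is available, which is why empirically computable point estimates and interval bounds of the kind described above are needed in place of this simulation.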