Conformal predictors come with rigorous statistical guarantees on their error frequency, but their practical utility depends critically on the size of the prediction sets they produce. Unfortunately, finite-sample analyses and guarantees for prediction set size are currently lacking. To address this shortfall, we theoretically quantify the expected size of the prediction sets under the split conformal prediction framework. Because this exact expression usually cannot be computed directly, we further derive point estimates and high-probability interval bounds that can be computed empirically, providing a practical method for characterizing the expected set size. We corroborate the efficacy of our results with experiments on real-world datasets for both regression and classification problems.
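To make the quantity under study concrete, the following is a minimal sketch of split conformal prediction for regression, assuming absolute-residual scores, a least-squares line as the base model, and synthetic Gaussian data. None of these choices come from the paper. The helper names (`conformal_quantile`, `split_conformal`, `sample`) are hypothetical, and the Monte Carlo average at the end is only a brute-force illustration of "expected set size", not the point estimates or interval bounds derived in this work.

```python
import math
import numpy as np


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])


def split_conformal(x_train, y_train, x_cal, y_cal, alpha):
    """Fit a line on the training split and calibrate on the held-out split.

    The prediction set at x is the interval [f(x) - q, f(x) + q].
    """
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    scores = np.abs(y_cal - (slope * x_cal + intercept))
    q = conformal_quantile(scores, alpha)
    return slope, intercept, q


rng = np.random.default_rng(0)


def sample(n):
    """Synthetic data (an assumption for illustration): y = 2x + noise."""
    x = rng.uniform(-1.0, 1.0, n)
    y = 2.0 * x + rng.normal(0.0, 0.5, n)
    return x, y


alpha = 0.1
x_tr, y_tr = sample(500)
x_cal, y_cal = sample(1000)
x_te, y_te = sample(2000)

slope, intercept, q = split_conformal(x_tr, y_tr, x_cal, y_cal, alpha)
set_size = 2.0 * q  # interval length; constant in x for this score
coverage = float(np.mean(np.abs(y_te - (slope * x_te + intercept)) <= q))

# Brute-force Monte Carlo view of the expected set size: redraw the
# calibration split many times (training split held fixed) and average.
sizes = []
for _ in range(200):
    xc, yc = sample(1000)
    sizes.append(2.0 * split_conformal(x_tr, y_tr, xc, yc, alpha)[2])
mean_size = float(np.mean(sizes))

print(f"set size: {set_size:.3f}, test coverage: {coverage:.3f}")
print(f"Monte Carlo mean set size over calibration draws: {mean_size:.3f}")
```

With this score the set size is the same for every test point, so the only randomness in it comes from the calibration draw. Redrawing calibration data is possible here only because the data are simulated, which is why estimators computable from a single dataset are needed in practice.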