Inductive Conformal Prediction (ICP) provides a practical and effective approach for equipping deep learning models with uncertainty estimates in the form of set-valued predictions which are guaranteed to contain the ground truth with high probability. Despite the appeal of this coverage guarantee, these sets may not be efficient: the size and contents of the prediction sets are not directly controlled, and instead depend on the underlying model and choice of score function. To remedy this, recent work has proposed learning model and score function parameters using data to directly optimize the efficiency of the ICP prediction sets. While appealing, the generalization theory for such an approach is lacking: direct optimization of empirical efficiency may yield prediction sets that are either no longer efficient on test data, or no longer obtain the required coverage on test data. In this work, we use PAC-Bayes theory to obtain generalization bounds on both the coverage and the efficiency of set-valued predictors which can be directly optimized to maximize efficiency while satisfying a desired test coverage. In contrast to prior work, our framework allows us to utilize the entire calibration dataset to learn the parameters of the model and score function, instead of requiring a separate hold-out set for obtaining test-time coverage guarantees. We leverage these theoretical results to provide a practical algorithm for using calibration data to simultaneously fine-tune the parameters of a model and score function while guaranteeing test-time coverage and efficiency of the resulting prediction sets. We evaluate the approach on regression and classification tasks, and outperform baselines calibrated using a Hoeffding bound-based PAC guarantee on ICP, especially in the low-data regime.
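For orientation, the standard ICP calibration step that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of ordinary split conformal prediction for regression, assuming an absolute-residual score; it is not the PAC-Bayes procedure proposed in the paper, and the names `icp_threshold` and `predict_interval` are hypothetical.

```python
import math
import numpy as np


def icp_threshold(cal_scores, alpha):
    """Conformal threshold from calibration scores.

    Returns the k-th smallest score with k = ceil((n + 1) * (1 - alpha)).
    If k exceeds n, the calibration set is too small for the requested
    level and the threshold is infinite (the prediction set is everything).
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return float(scores[k - 1])


def predict_interval(point_pred, threshold):
    """Prediction set for the score |y - f(x)|: an interval around f(x)."""
    return point_pred - threshold, point_pred + threshold


# Assumed usage: scores are |y_i - f(x_i)| on held-out calibration data.
cal_scores = [5.0, 1.0, 3.0, 2.0, 4.0]
tau = icp_threshold(cal_scores, alpha=0.5)
lower, upper = predict_interval(10.0, tau)
```

Under exchangeability of calibration and test points, this construction covers the true label with marginal probability at least 1 - alpha. The width of the interval, however, is fixed entirely by the model and score function, which is the efficiency limitation the abstract describes.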