Deep neural networks have achieved remarkable success across a wide range of tasks, yet they often produce unreliable probability estimates and are consequently overconfident in their predictions. Conformal Prediction (CP) offers a principled framework for uncertainty quantification, yielding prediction sets with rigorous coverage guarantees. Existing conformal training methods optimize the overall prediction-set size, but shaping the sets in a class-conditional manner is not straightforward and typically requires prior knowledge of the data distribution. In this work, we introduce Class Adaptive Conformal Training (CaCT), which formulates conformal training as an augmented Lagrangian optimization problem and adaptively learns to shape prediction sets class-conditionally without making any distributional assumptions. Experiments on multiple benchmark datasets, spanning standard and long-tailed image recognition as well as text classification, demonstrate that CaCT consistently outperforms prior conformal training methods, producing significantly smaller and more informative prediction sets while maintaining the desired coverage guarantees.
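To make the formulation concrete, below is a minimal PyTorch sketch of what such an augmented-Lagrangian conformal training objective could look like, assuming a ConfTr-style smooth (sigmoid-relaxed) prediction set. The function names, the penalty form, and the hyperparameters (`tau`, `rho`, `alpha`, `lambdas`) are illustrative assumptions, not CaCT's actual implementation.

```python
# Hedged sketch: augmented-Lagrangian objective for class-conditional
# conformal training. All names and update rules are illustrative
# assumptions, not the paper's actual method.
import torch


def smooth_set(scores: torch.Tensor, tau: float, temperature: float = 0.1) -> torch.Tensor:
    """Soft set membership: close to 1 when a class's nonconformity score
    falls below the threshold tau, close to 0 otherwise."""
    return torch.sigmoid((tau - scores) / temperature)


def augmented_lagrangian_loss(scores, labels, tau, lambdas, rho, alpha=0.1):
    """Minimize expected set size subject to per-class soft coverage of at
    least 1 - alpha (a simplified inequality-constrained AL surrogate)."""
    membership = smooth_set(scores, tau)           # (batch, num_classes)
    size_loss = membership.sum(dim=1).mean()       # average soft set size

    gaps = []                                      # g_k = (1 - alpha) - coverage_k
    for k in range(scores.shape[1]):
        mask = labels == k
        if mask.any():
            gaps.append((1 - alpha) - membership[mask, k].mean())
        else:
            gaps.append(torch.zeros(()))           # class absent from this batch
    gaps = torch.stack(gaps)

    # Linear multiplier term plus quadratic penalty on violated constraints.
    penalty = (lambdas * gaps + 0.5 * rho * gaps.clamp(min=0) ** 2).sum()
    return size_loss + penalty, gaps


# Dual ascent on the multipliers between epochs (projected to stay >= 0):
#   lambdas = (lambdas + rho * gaps.detach()).clamp(min=0)
```

In a sketch of this kind, the multipliers act as per-class weights learned during training: a class whose coverage constraint is violated sees its multiplier grow, pushing the model to enlarge sets for that class, which is one way class-conditional shaping could be achieved without distributional assumptions.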