Traditional conformal prediction methods construct prediction sets that contain the true label with probability at least a user-specified coverage level. However, a poorly chosen coverage level can yield uninformative predictions: overly conservative sets when the level is too high, or empty sets when it is too low. Moreover, a fixed coverage level cannot adapt to the characteristics of each individual example, which limits the flexibility and efficiency of these methods. In this work, we leverage recent advances in e-values and post-hoc conformal inference, which allow the use of data-dependent coverage levels while maintaining valid statistical guarantees. We propose to optimize an adaptive coverage policy by training a neural network using a leave-one-out procedure on the calibration set, allowing the coverage level and the resulting prediction set size to vary with the difficulty of each individual example. We support our approach with theoretical coverage guarantees and demonstrate its practical benefits through a series of experiments.
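To make the e-value mechanism mentioned above more concrete, the following is a minimal sketch of how a prediction set can be built from a conformal e-value and how a miscoverage level can then be chosen after looking at the data. It is not the authors' method. The ratio-form e-value, the requirement that nonconformity scores be nonnegative, the helper names (`conformal_e_value`, `prediction_set`, `pick_alpha`), the toy scores, and the grid-search policy standing in for the trained neural network are all assumptions made for illustration.

```python
def conformal_e_value(cal_scores, test_score):
    """Assumed ratio-form e-value: test score over the mean of all n+1 scores.

    Scores are assumed nonnegative. Under exchangeability the n+1 such ratios
    sum to n+1, so each has expectation 1.
    """
    n = len(cal_scores)
    total = sum(cal_scores) + test_score
    if total == 0:
        return 1.0  # convention when every score is zero
    return (n + 1) * test_score / total


def prediction_set(cal_scores, candidate_scores, alpha):
    """Keep every label whose e-value stays below 1/alpha (Markov's inequality)."""
    threshold = 1.0 / alpha
    return [
        label
        for label, score in candidate_scores.items()
        if conformal_e_value(cal_scores, score) < threshold
    ]


def pick_alpha(cal_scores, candidate_scores, grid, max_size):
    """Hypothetical data-dependent policy, not the learned policy of the paper.

    Returns the smallest alpha on the grid (highest coverage) whose prediction
    set has at most max_size labels, falling back to the largest alpha.
    """
    for alpha in sorted(grid):
        if len(prediction_set(cal_scores, candidate_scores, alpha)) <= max_size:
            return alpha
    return max(grid)


# Toy data (made up): nonconformity scores on the calibration set, and the
# score each candidate label would receive for one test example.
cal_scores = [1.0, 2.0, 3.0, 4.0]
candidate_scores = {"cat": 0.5, "dog": 5.0, "bird": 40.0}

alpha = pick_alpha(cal_scores, candidate_scores, grid=[0.1, 0.3, 0.5, 0.9], max_size=2)
chosen_set = prediction_set(cal_scores, candidate_scores, alpha)
print(alpha, chosen_set)
```

In this sketch the e-value can never exceed n + 1, so any alpha below 1/(n + 1) keeps every label; this is one way an overly ambitious coverage level produces an uninformative set. With a data-dependent alpha, the guarantee that carries over from the e-value is a bound on the expected ratio of the miscoverage indicator to the chosen alpha, rather than a fixed-level coverage statement.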