Conformal Prediction (CP) is a distribution-free uncertainty estimation framework that constructs prediction sets guaranteed to contain the true answer with a user-specified probability. Intuitively, the size of the prediction set encodes a general notion of uncertainty, with larger sets corresponding to higher degrees of uncertainty. In this work, we leverage information theory to connect conformal prediction to other notions of uncertainty. More precisely, we prove three different ways to upper bound the intrinsic uncertainty, as measured by the conditional entropy of the target variable given the inputs, by combining CP with information-theoretic inequalities. Moreover, we demonstrate two direct and useful applications of this connection between conformal prediction and information theory: (i) more principled and effective conformal training objectives that generalize previous approaches and enable end-to-end training of machine learning models from scratch, and (ii) a natural mechanism to incorporate side information into conformal prediction. We empirically validate both applications in centralized and federated learning settings, showing that our theoretical results translate to lower inefficiency (average prediction set size) for popular CP methods.
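To make the prediction-set construction concrete, the following is a minimal sketch of standard split conformal prediction for classification, not the paper's method: a toy model, synthetic calibration data, and the nonconformity score (one minus the softmax probability of the true class) are all hypothetical placeholders. The coverage guarantee and the average set size ("inefficiency") reported at the end correspond to the quantities discussed in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def model_scores(x):
    """Hypothetical 3-class 'model': softmax scores peaking at the true class."""
    logits = rng.normal(0.0, 1.0, size=(len(x), 3))
    logits[np.arange(len(x)), x % 3] += 2.0  # boost the correct class
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Calibration split
n_cal = 1000
x_cal = rng.integers(0, 3, size=n_cal)
y_cal = x_cal % 3  # true labels (synthetic)
p_cal = model_scores(x_cal)

# Nonconformity score: 1 - softmax probability assigned to the true class
cal_scores = 1.0 - p_cal[np.arange(n_cal), y_cal]

# Conformal quantile for a user-specified miscoverage level alpha
alpha = 0.1
level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
q = np.quantile(cal_scores, level, method="higher")

# Prediction set: all classes whose nonconformity score falls below the threshold
x_test = rng.integers(0, 3, size=500)
y_test = x_test % 3
p_test = model_scores(x_test)
pred_sets = (1.0 - p_test) <= q  # boolean mask of shape (n_test, 3)

coverage = pred_sets[np.arange(len(x_test)), y_test].mean()
avg_size = pred_sets.sum(axis=1).mean()  # inefficiency: average prediction set size
print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

With enough calibration points, the empirical coverage concentrates near the target 1 - alpha, and smaller average set sizes indicate a more efficient (less uncertain) predictor.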