Conformal prediction (CP) is a wrapper around traditional machine learning models, giving coverage guarantees under the sole assumption of exchangeability; in classification problems, for a chosen significance level $\varepsilon$, CP guarantees that the error rate is at most $\varepsilon$, irrespective of whether the underlying model is misspecified. However, the prohibitive computational costs of "full" CP led researchers to design scalable alternatives, which alas do not attain the same guarantees or statistical power as full CP. In this paper, we use influence functions to efficiently approximate full CP. We prove that our method is a consistent approximation of full CP, and empirically show that the approximation error becomes smaller as the training set grows; e.g., for $10^{3}$ training points the two methods output p-values that are less than $10^{-3}$ apart: a negligible error for any practical application. Our method enables scaling full CP to large real-world datasets. We compare our full CP approximation (ACP) to mainstream CP alternatives, and observe that our method is computationally competitive whilst enjoying the statistical predictive power of full CP.
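To make the cost of full CP concrete, the following is a minimal sketch of full CP for classification; it is an illustration under stated assumptions, not the ACP method of this paper. The nonconformity measure (distance to the centroid of a point's own class) and the function names `nonconformity_scores`, `full_cp_pvalues`, and `prediction_set` are hypothetical choices made for brevity. The key point is that the model is refit once per candidate label for every test point, which is the step an influence-function approximation aims to avoid.

```python
import numpy as np


def nonconformity_scores(X, y):
    """Toy nonconformity measure (an assumption, not the paper's):
    distance of each point to the centroid of its own class."""
    scores = np.empty(len(y), dtype=float)
    for label in np.unique(y):
        mask = y == label
        centroid = X[mask].mean(axis=0)
        scores[mask] = np.linalg.norm(X[mask] - centroid, axis=1)
    return scores


def full_cp_pvalues(X_train, y_train, x_test, labels):
    """Full CP: for each candidate label, 'retrain' on the training set
    augmented with (x_test, label) and compute the conformal p-value."""
    pvalues = {}
    for label in labels:
        X_aug = np.vstack([X_train, x_test])
        y_aug = np.append(y_train, label)
        scores = nonconformity_scores(X_aug, y_aug)
        # Fraction of the n + 1 points at least as nonconforming as the
        # test point (the test point counts itself).
        pvalues[label] = float(np.mean(scores >= scores[-1]))
    return pvalues


def prediction_set(pvalues, epsilon):
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in pvalues.items() if p > epsilon}


# Usage on a tiny two-cluster dataset.
X_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                    [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
x_test = np.array([0.5, 0.5])

pvals = full_cp_pvalues(X_train, y_train, x_test, labels=[0, 1])
print(pvals, prediction_set(pvals, epsilon=0.2))
```

With $n$ training points and $\ell$ labels, this loop performs $\ell$ refits per test point. That is cheap for the centroid-based measure used here, but it becomes prohibitive when each refit means training a large model from scratch.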