Approximating Full Conformal Prediction at Scale via Influence Functions

Conformal prediction (CP) is a wrapper around traditional machine learning models that gives coverage guarantees under the sole assumption of exchangeability. In classification problems, for a chosen significance level $\varepsilon$, CP guarantees that the error rate is at most $\varepsilon$, irrespective of whether the underlying model is misspecified. However, the prohibitive computational cost of "full" CP has led researchers to design scalable alternatives, which, alas, do not attain the same guarantees or statistical power as full CP. In this paper, we use influence functions to efficiently approximate full CP. We prove that our method is a consistent approximation of full CP, and empirically show that the approximation error becomes smaller as the training set grows; e.g., for $10^{3}$ training points the two methods output p-values that are $<10^{-3}$ apart, a negligible error for any practical application. Our methods enable scaling full CP to large real-world datasets. We compare our full CP approximation (ACP) to mainstream CP alternatives, and observe that our method is computationally competitive whilst enjoying the statistical predictive power of full CP.
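To make the cost of full CP concrete, the following is a minimal sketch of exact full CP for classification. It is not the paper's ACP method and uses no influence functions. The nearest-centroid model, the squared-distance nonconformity score, and all function names are assumptions chosen purely for illustration. The point to notice is that the model is refit once per candidate label for every test point, which is the expense the abstract describes as prohibitive.

```python
from collections import defaultdict


def fit_centroids(X, y):
    """'Train' a toy nearest-centroid model: one mean vector per label."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def nonconformity(centroids, x, label):
    """Squared distance from x to the centroid of its (putative) label."""
    return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))


def full_cp_pvalues(X, y, x_test, labels):
    """Full CP p-values: refit on the augmented set for each candidate label."""
    pvalues = {}
    for cand in labels:
        # Augment the training set with the test point under the candidate label.
        X_aug, y_aug = X + [x_test], y + [cand]
        centroids = fit_centroids(X_aug, y_aug)  # one retraining per candidate
        scores = [nonconformity(centroids, xi, yi)
                  for xi, yi in zip(X_aug, y_aug)]
        test_score = scores[-1]
        # Fraction of examples at least as nonconforming as the test example.
        pvalues[cand] = sum(s >= test_score for s in scores) / len(scores)
    return pvalues


def prediction_set(pvalues, epsilon):
    """Keep every label whose p-value exceeds the significance level."""
    return {lab for lab, p in pvalues.items() if p > epsilon}


if __name__ == "__main__":
    X = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
    y = [0, 0, 0, 1, 1, 1]
    p = full_cp_pvalues(X, y, [0.1], labels=[0, 1])
    print(p, prediction_set(p, epsilon=0.2))
```

With n training points, L labels, and m test points, this loop performs m × L retrainings. For a model that is expensive to fit, that count is what motivates approximating the effect of the added point rather than refitting from scratch.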
