This work is concerned with conformal prediction in contemporary applications (including generative AI) where a black-box model has been trained on data that are not accessible to the user. Mirroring split-conformal inference, we design a wrapper around a black-box algorithm which calibrates conformity scores. This calibration is local and proceeds in two stages: it first adaptively partitions the predictor space into groups and then calibrates separately within each group. Adaptive partitioning (self-grouping) is achieved by fitting a robust regression tree to the conformity scores on the calibration set. This new tree variant is designed so that, with overwhelmingly high probability, adding a single new observation does not change the tree fit. This add-one-in robustness property allows us to establish a finite-sample group-conditional coverage guarantee, a refinement of the marginal guarantee. In addition, unlike traditional split-conformal inference, adaptive splitting and within-group calibration yield adaptive bands which can stretch and shrink locally. We demonstrate the benefits of local tightening on several simulated and real examples using non-parametric regression. Finally, we consider two contemporary classification applications in which we quantify uncertainty around GPT-4o predictions. We conformalize skin disease diagnoses based on self-reported symptoms as well as predicted states of U.S. legislators based on summaries of their ideology. We demonstrate substantial local tightening of the uncertainty sets while attaining similar marginal coverage.
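The two-stage calibration described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the partition below is a plain least-squares stump on a one-dimensional predictor, swapped in for the robust regression tree, so it does not carry the add-one-in robustness property or the finite-sample group-conditional guarantee. The function names (`conformal_quantile`, `fit_stump`, `calibrate_by_group`), the `min_leaf` parameter, and the toy data are all assumptions made for illustration.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points: only the trivial set is valid
    return sorted(scores)[k - 1]


def fit_stump(x, scores, min_leaf=50):
    """Least-squares split of the scores on a 1-D predictor.

    Stand-in for the paper's robust tree (assumption): an ordinary CART-style
    stump with no robustness to adding one observation.
    Returns the cut point, or None if no admissible split exists.
    """
    pairs = sorted(zip(x, scores))
    n = len(pairs)
    total = sum(s for _, s in pairs)
    best_cut, best_gain = None, -math.inf
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        # Skip splits that leave a small leaf or fall between tied x values.
        if i < min_leaf or n - i < min_leaf or pairs[i - 1][0] == pairs[i][0]:
            continue
        right_sum = total - left_sum
        # Maximizing this quantity is equivalent to minimizing within-leaf SSE.
        gain = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if gain > best_gain:
            best_gain = gain
            best_cut = 0.5 * (pairs[i - 1][0] + pairs[i][0])
    return best_cut


def calibrate_by_group(x_cal, scores_cal, alpha, min_leaf=50):
    """Stage 1: partition by fitting a stump to the conformity scores.
    Stage 2: compute one conformal threshold per group."""
    cut = fit_stump(x_cal, scores_cal, min_leaf)
    if cut is None:
        group = lambda v: 0
    else:
        group = lambda v: int(v > cut)
    by_group = {}
    for v, s in zip(x_cal, scores_cal):
        by_group.setdefault(group(v), []).append(s)
    thresholds = {g: conformal_quantile(s, alpha) for g, s in by_group.items()}
    return group, thresholds


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical black box that always predicts 0; noise is larger for x >= 0.5.
    x_cal = [random.random() for _ in range(1000)]
    y_cal = [random.gauss(0.0, 0.1 if v < 0.5 else 1.0) for v in x_cal]
    scores = [abs(y) for y in y_cal]  # absolute-residual conformity scores
    group, thresholds = calibrate_by_group(x_cal, scores, alpha=0.1)
    for x_new in (0.2, 0.9):
        half_width = thresholds[group(x_new)]
        print(f"x={x_new}: band = [{-half_width:.3f}, {half_width:.3f}]")
```

In this toy setting the band is narrow where the scores are small and wide where they are large, whereas a single marginal threshold would use one width everywhere. The sketch only shows the mechanics of within-group calibration; the coverage guarantee stated in the abstract relies on the robust tree construction.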