PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework

As machine learning (ML) enters high-stakes domains, trustworthy uncertainty quantification (UQ) is essential for safety. In this paper we introduce PCS-UQ, a framework based on the Predictability, Computability, and Stability (PCS) principles for veridical data science. Starting from a candidate set of models or algorithms, PCS-UQ applies a rigorous prediction check to screen out unsuitable candidates, then uses bootstrap resampling to capture both inter-sample variability and algorithmic instability among the candidates that pass the check. We then introduce a novel multiplicative calibration scheme to enhance local adaptivity, which corresponds to a new score in conformal prediction. Moreover, we compile a benchmark of 17 real-world regression datasets with manually constructed subgroups. On this benchmark, PCS-UQ maintains the target coverage while matching or outperforming, in interval width, conformal methods equipped with oracle-selected algorithms. It also achieves more consistent subgroup coverage than these oracle-selected conformal methods, and so stands out in combining competitive interval widths with consistent subgroup coverage. Across 6 classification datasets, PCS-UQ reduces prediction set sizes by 20%. To scale the framework to deep learning, we propose computationally efficient variants that bypass expensive retraining. On three computer vision benchmarks, these variants reduce prediction set sizes by 20% over conformal baselines. Finally, we prove that a modified PCS-UQ algorithm is a form of split conformal inference and therefore preserves valid coverage under exchangeability.
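
The regression pipeline described above (prediction check, bootstrap, multiplicative calibration) can be sketched as follows. This is a minimal illustration, not the authors' implementation, and it rests on assumptions the abstract does not state: polynomial least-squares fits stand in for the candidate models, the prediction check keeps the `keep_top` candidates with the lowest validation error, raw intervals are bootstrap quantiles, and calibration is a single factor `gamma` that stretches each half-width around the bootstrap median. All names (`pcs_uq_sketch`, `keep_top`, `n_boot`, `make_data`) are hypothetical. The sketch reuses one validation set for both screening and calibration, so it makes no claim to the coverage guarantee proved for the modified algorithm.

```python
import numpy as np


def fit_poly(x, y, degree):
    """Fit a polynomial by least squares and return a prediction function."""
    coefs = np.polyfit(x, y, degree)
    return lambda x_new: np.polyval(coefs, x_new)


def pcs_uq_sketch(x_train, y_train, x_val, y_val, x_test,
                  degrees=(0, 1, 3, 5), keep_top=2, n_boot=50,
                  alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)

    # 1. Prediction check (assumed form): keep the candidates with the
    #    lowest validation mean squared error.
    val_mse = [np.mean((fit_poly(x_train, y_train, d)(x_val) - y_val) ** 2)
               for d in degrees]
    kept = [degrees[i] for i in np.argsort(val_mse)[:keep_top]]

    # 2. Bootstrap: refit every kept candidate on resampled training data,
    #    so the spread reflects both sampling and model-choice variability.
    n = len(x_train)
    val_preds, test_preds = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        for d in kept:
            model = fit_poly(x_train[idx], y_train[idx], d)
            val_preds.append(model(x_val))
            test_preds.append(model(x_test))

    def raw_interval(preds):
        preds = np.asarray(preds)
        lo = np.quantile(preds, alpha / 2, axis=0)
        mid = np.median(preds, axis=0)
        hi = np.quantile(preds, 1 - alpha / 2, axis=0)
        return lo, mid, hi

    # 3. Multiplicative calibration: each validation point's score is the
    #    factor by which its half-width must be stretched to reach y.
    lo, mid, hi = raw_interval(val_preds)
    eps = 1e-12  # guards against zero-width raw intervals
    score = np.where(y_val >= mid,
                     (y_val - mid) / np.maximum(hi - mid, eps),
                     (mid - y_val) / np.maximum(mid - lo, eps))
    # Conformal-style rank: ceil((1 - alpha) * (n_val + 1)), capped at n_val.
    k = min(int(np.ceil((1 - alpha) * (len(y_val) + 1))), len(y_val))
    gamma = np.sort(score)[k - 1]

    lo_t, mid_t, hi_t = raw_interval(test_preds)
    lower = mid_t - gamma * (mid_t - lo_t)
    upper = mid_t + gamma * (hi_t - mid_t)
    return lower, upper, gamma, kept


# Usage on synthetic heteroscedastic data (purely illustrative).
data_rng = np.random.default_rng(1)


def make_data(n):
    x = data_rng.uniform(-2, 2, n)
    y = np.sin(2 * x) + (0.1 + 0.2 * np.abs(x)) * data_rng.normal(size=n)
    return x, y


x_tr, y_tr = make_data(300)
x_va, y_va = make_data(200)
x_te, y_te = make_data(200)
lower, upper, gamma, kept = pcs_uq_sketch(x_tr, y_tr, x_va, y_va, x_te)
print("kept degrees:", kept)
print("calibration factor:", gamma)
print("test coverage:", np.mean((y_te >= lower) & (y_te <= upper)))
```

Because `gamma` multiplies the bootstrap half-widths rather than adding a constant, the calibrated intervals stay narrow where the bootstrap predictions agree and widen where they disagree. That is the sense in which a multiplicative scheme can be more locally adaptive than an additive one.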
