Data-driven surrogate models offer fast approximations of complex numerical and experimental systems but typically lack uncertainty quantification, which limits their reliability in safety-critical applications. Bayesian methods provide uncertainty estimates, but they offer no statistical guarantees and struggle with high-dimensional spatio-temporal problems because of their computational cost. We present a conformal prediction (CP) framework that provides statistically guaranteed marginal coverage for surrogate models in a model-agnostic manner at near-zero computational cost. Our approach handles high-dimensional spatio-temporal outputs by performing cell-wise calibration while preserving the tensorial structure of the predictions. Through extensive empirical evaluation across diverse applications, including fluid dynamics, magnetohydrodynamics, weather forecasting, and fusion diagnostics, we demonstrate that CP yields valid error bars that attain the target empirical coverage regardless of model architecture, training regime, or output dimensionality. We evaluate three nonconformity scores (conformalised quantile regression, absolute error residual, and standard deviation) for both deterministic and probabilistic models, showing that guaranteed coverage holds even for out-of-distribution predictions, where models are deployed on physics regimes different from those in the training data. Calibration requires only seconds to minutes on standard hardware. The framework enables rigorous validation of pre-trained surrogate models for downstream applications without retraining. Although CP provides marginal rather than conditional coverage and assumes exchangeability between calibration and test data, our method circumvents the curse of dimensionality inherent in traditional uncertainty quantification approaches, offering a practical tool for the trustworthy deployment of machine learning in the physical sciences.
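The cell-wise calibration described in the abstract can be illustrated with a minimal sketch for the absolute error residual score. This is not the authors' implementation: the function names `calibrate_cellwise` and `predict_interval`, the array layout (calibration samples on the first axis, tensor-shaped outputs on the remaining axes), and the default `alpha` are assumptions made for illustration. The sketch applies standard split conformal prediction independently to every output cell, so the calibrated half-widths keep the shape of the prediction tensor.

```python
import numpy as np


def calibrate_cellwise(pred_cal, true_cal, alpha=0.1):
    """Return one conformal half-width per output cell.

    pred_cal, true_cal: arrays of shape (n_cal, *output_shape), assumed layout.
    alpha: target miscoverage level, so coverage is at least 1 - alpha.
    """
    n = pred_cal.shape[0]
    # Nonconformity score: absolute error residual, computed per cell.
    scores = np.abs(true_cal - pred_cal)
    # Finite-sample corrected rank of the order statistic to select.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        raise ValueError("calibration set too small for this alpha")
    # Sort along the sample axis and take the k-th smallest score in each cell.
    return np.sort(scores, axis=0)[k - 1]


def predict_interval(pred, qhat):
    """Lower and upper bounds; qhat broadcasts over leading sample axes."""
    return pred - qhat, pred + qhat
```

Because calibration reduces to one sort along the sample axis, its cost is small compared with training or running the surrogate, which is consistent with the seconds-to-minutes calibration time quoted above. The guarantee obtained this way is marginal per cell and relies on calibration and test samples being exchangeable.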