Easy Uncertainty Quantification (EasyUQ): Generating Predictive Distributions from Single-valued Model Output

How can we quantify uncertainty if our favorite computational tool (be it a numerical, a statistical, or a machine learning approach, or just any computer model) provides single-valued output only? In this article, we introduce the Easy Uncertainty Quantification (EasyUQ) technique, which transforms real-valued model output into calibrated statistical distributions, based solely on training data of model output-outcome pairs, without any need to access model input. In its basic form, EasyUQ is a special case of the recently introduced Isotonic Distributional Regression (IDR) technique, which leverages the pool-adjacent-violators algorithm for nonparametric isotonic regression. EasyUQ yields discrete predictive distributions that are calibrated and optimal in finite samples, subject to stochastic monotonicity. The workflow is fully automated, without any need for tuning. The Smooth EasyUQ approach supplements IDR with kernel smoothing to yield continuous predictive distributions that preserve key properties of the basic form, including both stochastic monotonicity with respect to the original model output and asymptotic consistency. For the selection of kernel parameters, we introduce multiple one-fit grid search, a computationally much less demanding approximation to leave-one-out cross-validation. We use simulation examples and forecast data from weather prediction to illustrate the techniques. In a study of benchmark problems from machine learning, we show how EasyUQ and Smooth EasyUQ can be integrated into the workflow of neural network learning and hyperparameter tuning, and find EasyUQ to be competitive with conformal prediction, as well as with more elaborate input-based approaches.
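The basic form described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the conditional CDF at each threshold is obtained by a non-increasing (antitonic) least-squares fit of the indicator 1{y <= t} on the model output x, computed with pool-adjacent-violators, and that predictions at new model outputs interpolate linearly between fitted values. The function names `pav_decreasing`, `easyuq_fit`, and `easyuq_predict_cdf` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def pav_decreasing(values, weights):
    """Weighted least-squares non-increasing fit via pool-adjacent-violators."""
    blocks = []  # each block holds [mean, total weight, number of pooled entries]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool while the previous block mean is below the current one,
        # which would violate the non-increasing constraint.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    return np.concatenate([np.full(c, m) for m, _, c in blocks])


def easyuq_fit(x, y):
    """Fit discrete conditional CDFs from (model output, outcome) pairs.

    Returns the sorted unique model outputs, the sorted unique outcomes
    (thresholds), and a matrix cdf[i, j] estimating P(Y <= thresholds[j] | x = xs[i]).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    thresholds = np.unique(y)
    cdf = np.empty((len(xs), len(thresholds)))
    for j, t in enumerate(thresholds):
        indicator = (y <= t).astype(float)
        # Average the indicator over tied model outputs before pooling.
        means = np.bincount(inverse, weights=indicator, minlength=len(xs)) / counts
        cdf[:, j] = pav_decreasing(means, counts)
    return xs, thresholds, cdf


def easyuq_predict_cdf(model, x_new):
    """CDF values at each threshold for a new model output (assumed linear interpolation in x)."""
    xs, thresholds, cdf = model
    return np.array([np.interp(x_new, xs, cdf[:, j]) for j in range(len(thresholds))])


# Toy usage: four training pairs of model output and observed outcome.
model = easyuq_fit([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
print(model[1])                        # thresholds
print(easyuq_predict_cdf(model, 2.5))  # predictive CDF at those thresholds
```

Because each threshold is fitted separately under the same ordering constraint, the fitted CDFs are non-decreasing in the threshold and non-increasing in the model output, which is the stochastic monotonicity property the abstract refers to. No tuning parameter appears anywhere in the sketch.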

