A review of predictive uncertainty estimation with machine learning

Predictions and forecasts of machine learning models should take the form of probability distributions, so as to increase the amount of information communicated to end users. Although applications of probabilistic prediction and forecasting with machine learning models are becoming more frequent in academia and industry, the related concepts and methods have not been formalized and structured under a holistic view of the entire field. Here, we review the topic of predictive uncertainty estimation with machine learning algorithms, as well as the related metrics (consistent scoring functions and proper scoring rules) for assessing probabilistic predictions. The review covers a period spanning from the introduction of early statistical models (linear regression and time series models, based on Bayesian statistics or quantile regression) to recent machine learning algorithms (including generalized additive models for location, scale and shape, random forests, boosting and deep learning algorithms) that are more flexible by nature. Reviewing the progress in the field improves our understanding of how to develop new algorithms tailored to users' needs, since the latest advancements apply a small set of fundamental concepts to more complex algorithms. We conclude by classifying the material and discussing challenges that are becoming active topics of research.
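As an illustration of the consistent scoring functions mentioned above, the sketch below uses the pinball (quantile) loss, the standard scoring function for quantile predictions. It is a minimal example and not taken from the review itself: the function name `pinball_loss`, the simulated standard normal data, the quantile level 0.9, and the candidate grid are all assumptions made for demonstration.

```python
import numpy as np


def pinball_loss(y, q_pred, tau):
    """Average pinball (quantile) loss of a constant prediction q_pred at level tau."""
    y = np.asarray(y, dtype=float)
    diff = y - q_pred
    # tau * diff when the observation exceeds the prediction,
    # (tau - 1) * diff otherwise; both branches are non-negative.
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


# Hypothetical data: draws from a standard normal distribution.
rng = np.random.default_rng(0)
y = rng.normal(loc=0.0, scale=1.0, size=10_000)

tau = 0.9  # assumed quantile level of interest
candidates = np.linspace(-3.0, 3.0, 601)  # grid of candidate predictions, step 0.01
losses = [pinball_loss(y, c, tau) for c in candidates]

# The candidate with the lowest average loss should sit next to the sample quantile.
best = float(candidates[int(np.argmin(losses))])
empirical_q = float(np.quantile(y, tau))
print(best, empirical_q)
```

The candidate that minimizes the average loss lands next to the empirical 0.9 quantile of the sample, which is what consistency of a scoring function for a quantile means in practice: reporting the true quantile is the best strategy on average.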

