This short study presents an opportunistic approach to a (more) reliable validation method for the average calibration of prediction uncertainty. Considering that variance-based calibration metrics (ZMS, NLL, RCE...) are quite sensitive to the presence of heavy tails in the uncertainty and error distributions, a shift is proposed to an interval-based metric, the Prediction Interval Coverage Probability (PICP). It is shown on a large ensemble of molecular properties datasets that (1) sets of z-scores are well represented by Student's $t(\nu)$ distributions, $\nu$ being the number of degrees of freedom; (2) accurate estimation of 95 $\%$ prediction intervals can be obtained by the simple $2\sigma$ rule for $\nu>3$; and (3) the resulting PICPs are tested more quickly and reliably than variance-based calibration metrics. Overall, this method enables the testing of 20 $\%$ more datasets than ZMS testing. Conditional calibration is also assessed using the PICP approach.
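The two quantities at the heart of the abstract can be illustrated with a short sketch (this is not the study's code; the function names are illustrative, and `scipy` is assumed to be available). It checks that the $\pm 2\sigma$ interval of a Student's $t(\nu)$ distribution covers roughly 95 $\%$ of the probability mass once $\nu>3$, and shows a minimal PICP estimator over a set of z-scores:

```python
import numpy as np
from scipy import stats

def two_sigma_coverage(nu):
    """Probability mass of a Student's t(nu) distribution inside
    +/- 2*sigma, where sigma = sqrt(nu / (nu - 2)) is its standard
    deviation (defined for nu > 2)."""
    sigma = np.sqrt(nu / (nu - 2))
    return 2.0 * stats.t.cdf(2.0 * sigma, df=nu) - 1.0

def picp(z_scores, half_width=2.0):
    """Prediction Interval Coverage Probability: the fraction of
    z-scores falling inside [-half_width, +half_width]."""
    z = np.asarray(z_scores)
    return float(np.mean(np.abs(z) <= half_width))

for nu in (4, 10, 100):
    print(f"nu = {nu:3d}: 2-sigma coverage = {two_sigma_coverage(nu):.4f}")
```

For $\nu=4$ the interval half-width $2\sigma = 2\sqrt{2}$ slightly exceeds the 97.5 $\%$ quantile of $t(4)$ ($\approx 2.776$), so the coverage lands close to 0.95; as $\nu$ grows the coverage tends to the Gaussian value of $\approx 0.954$.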