Developing trustworthy AI for critical applications in science and engineering requires machine learning techniques that can estimate their own uncertainty. In regression, this can be achieved by producing a predictive interval for the output instead of estimating only a conditional mean, or even by learning a model of the conditional probability $p(y|x)$ of an output $y$ given input features $x$. While this can be done under parametric assumptions, e.g. with generalized linear models, such assumptions are typically too strong, and non-parametric models offer flexible alternatives. In particular, for scalar outputs, directly learning a model of the conditional cumulative distribution function of $y$ given $x$ can lead to more precise probabilistic estimates, and the use of proper scoring rules such as the weighted interval score (WIS) and the continuous ranked probability score (CRPS) leads to better coverage and calibration properties. This paper introduces novel algorithms for learning probabilistic regression trees under the WIS or CRPS loss functions. These algorithms are made computationally efficient through an appropriate use of known data structures, namely min-max heaps, weight-balanced binary trees and Fenwick trees. Through numerical experiments, we demonstrate that the performance of our methods is competitive with alternative approaches. Additionally, our methods benefit from the inherent interpretability and explainability of trees. As a by-product, we show how our trees can be used in the context of conformal prediction and explain why they are particularly well-suited for achieving group-conditional coverage guarantees.
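To make the CRPS concrete, the following is a minimal sketch of how it can be evaluated when the predictive distribution is the empirical distribution of a set of values, for example the training outputs falling in one leaf. It uses the standard identity $\mathrm{CRPS}(F, y) = \mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$ with $X, X'$ drawn independently from $F$. The function name `crps_empirical` is hypothetical, and this naive version costs $O(n^2)$ per evaluation; it is not the efficient algorithm introduced in the paper.

```python
import numpy as np

def crps_empirical(samples, y):
    """CRPS of the empirical distribution of `samples` at observation `y`.

    Illustrative helper (hypothetical name), using the identity
    CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|, with X, X' i.i.d. from F.
    """
    samples = np.asarray(samples, dtype=float)
    # Mean absolute distance between the samples and the observation.
    term_obs = np.mean(np.abs(samples - y))
    # Mean absolute distance over all ordered pairs of samples (O(n^2)).
    term_spread = np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term_obs - 0.5 * term_spread

# A point mass at 0 scored against y = 1 reduces to the absolute error.
print(crps_empirical([0.0], 1.0))
# A two-point distribution on {0, 1} scored against y = 0.5.
print(crps_empirical([0.0, 1.0], 0.5))
```

Lower values indicate a better probabilistic forecast, and for a point-mass forecast the score coincides with the absolute error.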