Machine learning algorithms have grown in sophistication over the years and are increasingly deployed for real-life applications. However, when using machine learning techniques in practical settings, particularly in high-risk applications such as medicine and engineering, obtaining the failure probability of the predictive model is critical. We refer to this problem as the risk-assessment task. We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction. We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability. Using this coverage property, we prove that our approximated failure probability is conservative in the sense that it is not lower than the true failure probability of the ML algorithm. We conduct extensive experiments to empirically study the accuracy of the proposed method for problems with and without covariate shift. Our analysis focuses on different modeling regimes, dataset sizes, and conformal prediction methodologies.
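To illustrate how a coverage guarantee can be turned into a conservative failure probability, the following minimal Python sketch uses split conformal prediction with absolute-residual scores. It is an illustration under stated assumptions, not necessarily the procedure evaluated in this work: the function name `conservative_failure_probability`, the synthetic data, and the linear model are all hypothetical. Given n calibration residuals and a tolerance eps, the function returns (number of residuals exceeding eps + 1) / (n + 1). If calibration and test points are exchangeable (that is, there is no covariate shift), this value is an upper bound on the probability that a new label falls outside the interval of half-width eps around the prediction.

```python
import numpy as np


def conservative_failure_probability(cal_residuals, tolerance):
    """Upper-bound P(|y - f(x)| > tolerance) from calibration residuals.

    Hypothetical helper. It assumes the calibration and test points are
    exchangeable and that the model was NOT fitted on the calibration data.
    """
    scores = np.abs(np.asarray(cal_residuals, dtype=float))
    n = scores.size
    if n == 0:
        raise ValueError("at least one calibration residual is required")
    # Count calibration scores that already violate the tolerance.
    n_exceed = int(np.sum(scores > tolerance))
    # The +1 terms account for the unseen test point, which is what
    # makes the estimate conservative for finite calibration sets.
    return (n_exceed + 1) / (n + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Synthetic regression problem (an assumption made for this demo).
    x = rng.uniform(-1.0, 1.0, size=2000)
    y = 2.0 * x + rng.normal(0.0, 0.5, size=2000)

    x_train, y_train = x[:1000], y[:1000]
    x_cal, y_cal = x[1000:1500], y[1000:1500]
    x_test, y_test = x[1500:], y[1500:]

    # Fit a simple linear model on the training split only.
    coef = np.polyfit(x_train, y_train, deg=1)

    def predict(inputs):
        return np.polyval(coef, inputs)

    tolerance = 1.0  # half-width of the interval around the prediction
    estimate = conservative_failure_probability(
        y_cal - predict(x_cal), tolerance
    )

    # Empirical failure rate on held-out data, for comparison only.
    empirical = float(np.mean(np.abs(y_test - predict(x_test)) > tolerance))

    print(f"conservative estimate: {estimate:.4f}")
    print(f"empirical failure rate: {empirical:.4f}")
```

The guarantee behind this sketch is marginal: it holds on average over draws of the calibration set and the test point, so on a single finite test set the empirical failure rate can occasionally exceed the estimate. Handling covariate shift would require reweighting the calibration scores, which this sketch does not attempt.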