Reliably estimating the uncertainty of a prediction throughout the model lifecycle is crucial in many safety-critical applications. The most common way to measure this uncertainty is via the predicted confidence. While this tends to work well for in-domain samples, these estimates are unreliable under domain drift and are restricted to classification. Alternatively, proper scores can be used for most predictive tasks, but a bias-variance decomposition for model uncertainty does not exist in the current literature. In this work, we introduce a general bias-variance decomposition for proper scores, giving rise to the Bregman Information as the variance term. We show that exponential families and the classification log-likelihood are special cases and provide novel formulations for both. Surprisingly, we can express the classification case purely in the logit space. We showcase the practical relevance of this decomposition on several downstream tasks, including model ensembles and confidence regions. Further, we demonstrate how different approximations of the instance-level Bregman Information allow reliable out-of-distribution detection for all degrees of domain drift.
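To make the variance term more concrete, the following is a minimal sketch of an instance-level Bregman Information computed directly from ensemble logits. It uses the standard definition of Bregman Information for a convex generator, the mean of the generator over samples minus the generator at the sample mean. The choice of log-sum-exp as the generator for the classification case is an assumption made here for illustration and is not stated in the abstract; the names `logsumexp` and `bregman_information` are hypothetical helpers.

```python
import math


def logsumexp(z):
    """Numerically stable log(sum(exp(z_k))) for a list of logits."""
    m = max(z)
    return m + math.log(sum(math.exp(v - m) for v in z))


def bregman_information(logit_samples):
    """Mean of phi(z) minus phi(mean of z), with phi = logsumexp.

    logit_samples: one logit vector per ensemble member, all for the
    same input. Assumption: logsumexp is the convex generator.
    """
    n = len(logit_samples)
    k = len(logit_samples[0])
    # Average the logits across ensemble members, class by class.
    mean_logits = [sum(s[j] for s in logit_samples) / n for j in range(k)]
    # Average the generator values across ensemble members.
    mean_phi = sum(logsumexp(s) for s in logit_samples) / n
    return mean_phi - logsumexp(mean_logits)


# Three ensemble members that agree exactly on one input.
agree = [[2.0, 0.0, -1.0]] * 3
# Three ensemble members that each prefer a different class.
disagree = [[2.0, 0.0, -1.0], [-1.0, 2.0, 0.0], [0.0, -1.0, 2.0]]

bi_agree = bregman_information(agree)
bi_disagree = bregman_information(disagree)
print(bi_agree, bi_disagree)
```

Because log-sum-exp is convex, Jensen's inequality makes this quantity non-negative, and it vanishes when all ensemble members produce the same logits. Under the assumptions above, a larger value for an input would indicate stronger disagreement among members, which is the kind of signal the abstract proposes for out-of-distribution detection.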