Applying a machine learning model to real-world decision-making requires distinguishing what the model knows from what it does not. A critical factor in assessing a model's knowledge is quantifying its predictive uncertainty. Predictive uncertainty is commonly measured by the entropy of the Bayesian model average (BMA) predictive distribution. However, the adequacy of this measure was recently questioned. We provide new insights into its limitations. Our analysis shows that the current measure erroneously assumes that the BMA predictive distribution is equivalent to the predictive distribution of the true model that generated the dataset. We therefore introduce a theoretically grounded measure that overcomes these limitations, and we verify its benefits experimentally. The introduced measure behaves more reasonably in controlled synthetic tasks. Moreover, our evaluations on ImageNet demonstrate that it is advantageous in real-world applications that rely on predictive uncertainty.
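To make the commonly used measure concrete, here is a minimal sketch of the entropy of the BMA predictive distribution for a single input. It assumes the BMA is approximated by an equal-weight average over a handful of sampled models; the function name `bma_predictive_entropy` and the toy probability vectors are hypothetical and chosen only for illustration. The sketch covers only the existing measure, not the measure introduced in this work.

```python
import math


def bma_predictive_entropy(member_probs):
    """Entropy (in nats) of the averaged predictive distribution.

    member_probs: list of per-model class-probability vectors for one input.
    Assumption: the BMA is approximated by an equal-weight average of the
    sampled models, as in a simple Monte Carlo or ensemble estimate.
    """
    n_models = len(member_probs)
    n_classes = len(member_probs[0])
    # Average the class probabilities across models (the approximate BMA).
    bma = [sum(p[c] for p in member_probs) / n_models for c in range(n_classes)]
    # Shannon entropy, treating 0 * log(0) as 0.
    return -sum(q * math.log(q) for q in bma if q > 0)


# Hypothetical case 1: two models that confidently disagree.
disagree = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical case 2: two models that agree the input is ambiguous.
agree = [[0.5, 0.5], [0.5, 0.5]]

h_disagree = bma_predictive_entropy(disagree)
h_agree = bma_predictive_entropy(agree)
print(h_disagree, h_agree)
```

In both toy cases the averaged distribution is uniform over two classes, so the entropy equals log 2 even though the underlying models behave very differently. This is a property of the averaging step in this sketch and is not necessarily the specific limitation analyzed here.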