Uncertainty quantification in machine learning has progressed to predicting the source of uncertainty in a prediction: uncertainty arising from stochasticity in the data (aleatoric) or from limitations of the model (epistemic). Typically, each uncertainty is evaluated in isolation, which obscures the fact that the two are often not truly disentangled. This work proposes a set of experiments to evaluate the disentanglement of aleatoric and epistemic uncertainty, and uses these methods to compare two competing formulations for disentanglement: the Information Theoretic approach and the Gaussian Logits approach. The results suggest that the Information Theoretic approach yields better disentanglement, but that, for both methods, each predicted source of uncertainty remains substantially contaminated by the other. We conclude that current disentangling methods do not reliably separate aleatoric and epistemic uncertainty, and we provide a clear set of experimental criteria that good uncertainty disentanglement should satisfy.
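To make the Information Theoretic formulation concrete, the sketch below shows the standard entropy-based decomposition over an ensemble of categorical predictions: total predictive entropy splits into an expected-entropy term (aleatoric) and a mutual-information term (epistemic). The function names and the ensemble-of-softmax-outputs setup are illustrative assumptions, not code from the work described here.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) along the class axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def information_theoretic_decomposition(probs):
    """Entropy-based uncertainty decomposition for one input.

    probs: array of shape (n_members, n_classes) holding softmax
           outputs from an ensemble (or MC-dropout samples).
    Returns (total, aleatoric, epistemic), where
      total     = H[ E_theta[ p(y|x, theta) ] ]   (predictive entropy)
      aleatoric = E_theta[ H[ p(y|x, theta) ] ]   (expected entropy)
      epistemic = total - aleatoric               (mutual information)
    """
    mean_p = probs.mean(axis=0)
    total = entropy(mean_p)
    aleatoric = entropy(probs).mean()
    epistemic = total - aleatoric
    return total, aleatoric, epistemic
```

Two limiting cases illustrate the intended behaviour: if every ensemble member predicts the same uniform distribution, all uncertainty is attributed to the data (aleatoric), whereas if confident members disagree with one another, the same total entropy is attributed to the model (epistemic). The experiments summarized above probe how far this clean separation holds in practice.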