This paper examines evidential deep learning (EDL), a modern approach to predictive uncertainty estimation in which a single neural network is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite the strong empirical performance of EDL methods, recent studies by Bengs et al. identify a fundamental pitfall: the learned epistemic uncertainty may not vanish even in the infinite-sample limit. We corroborate this observation by providing a unifying view of a class of widely used objectives from the literature. Our analysis reveals that EDL methods essentially train the meta distribution by minimizing a divergence measure between that distribution and a sample-size-independent target distribution, which results in spurious epistemic uncertainty. Grounded in theoretical principles, we propose learning a consistent target distribution by modeling it with a mixture of Dirichlet distributions and fitting it via variational inference. A final meta distribution model then distills the learned uncertainty from the target model. Experimental results across various uncertainty-based downstream tasks demonstrate the superiority of the proposed method and illustrate the practical implications of consistent versus inconsistent learned epistemic uncertainty.
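To make the notion of vanishing versus spurious epistemic uncertainty concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a single Dirichlet meta distribution over three classes and uses the total variance of the class probabilities as one possible proxy for epistemic uncertainty. The function names, the prior, the class probabilities, and the fixed target parameters are all hypothetical values chosen for illustration.

```python
def dirichlet_total_variance(alpha):
    """Sum over classes of Var[p_k] for p ~ Dirichlet(alpha).

    Uses the closed form (1 - sum_k m_k^2) / (a0 + 1), where
    a0 = sum(alpha) and m_k = alpha_k / a0 is the mean of p_k.
    """
    a0 = sum(alpha)
    mean_sq = sum((a / a0) ** 2 for a in alpha)
    return (1.0 - mean_sq) / (a0 + 1.0)


def posterior_alpha(prior, class_probs, n):
    """Idealised Bayesian update (assumption): prior pseudo-counts
    plus the expected class counts after n observations."""
    return [p0 + n * q for p0, q in zip(prior, class_probs)]


prior = [1.0, 1.0, 1.0]          # hypothetical uniform prior
true_probs = [0.7, 0.2, 0.1]     # hypothetical label distribution
fixed_target = [8.0, 3.0, 2.0]   # hypothetical sample-size-independent target

for n in (0, 10, 1000, 100000):
    # Concentration grows with n, so the variance shrinks toward zero.
    consistent = dirichlet_total_variance(posterior_alpha(prior, true_probs, n))
    # The target ignores n, so the variance never changes.
    inconsistent = dirichlet_total_variance(fixed_target)
    print(f"n={n:>6}  sample-size-aware={consistent:.6f}  fixed-target={inconsistent:.6f}")
```

In this toy setting the sample-size-aware Dirichlet concentrates as `n` grows, while the fixed target keeps the same spread regardless of how much data is seen, which is the kind of non-vanishing epistemic uncertainty the abstract describes.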