Uncertainty quantification, once a singular task, has evolved into a spectrum of tasks, including abstained prediction, out-of-distribution detection, and aleatoric uncertainty quantification. The latest goal is disentanglement: the construction of multiple estimators that are each tailored to one and only one task. Hence, there is a plethora of recent advances with different intentions, and these intentions often deviate entirely from how the methods behave in practice. This paper conducts a comprehensive evaluation of numerous uncertainty estimators across diverse tasks on ImageNet. We find that, despite promising theoretical endeavors, disentanglement is not yet achieved in practice. Additionally, we reveal which uncertainty estimators excel at which specific tasks, providing insights for practitioners and guiding future research toward task-centric and disentangled uncertainty estimation methods. Our code is available at https://github.com/bmucsanyi/bud.