Accurate and efficient uncertainty estimation is crucial for building reliable Machine Learning (ML) models that provide calibrated uncertainty estimates, generalize, and detect Out-Of-Distribution (OOD) datasets. To this end, Deterministic Uncertainty Methods (DUMs) are a promising model family capable of performing uncertainty estimation in a single forward pass. This work investigates important design choices in DUMs: (1) we show that training schemes that decouple the core architecture from the uncertainty head can significantly improve uncertainty performance; (2) we demonstrate that the expressiveness of the core architecture is crucial for uncertainty performance, and that additional architecture constraints intended to avoid feature collapse can deteriorate the trade-off between OOD generalization and OOD detection; (3) contrary to other Bayesian models, we show that the prior defined by DUMs does not have a strong effect on final performance.