Recent work shows that the data distribution in a network's latent space is useful for estimating classification uncertainty and detecting out-of-distribution (OOD) samples. To obtain a well-regularized latent space that is conducive to uncertainty estimation, existing methods make significant changes to model architectures and training procedures. In this paper, we present a lightweight, fast, and high-performance regularization method for Mahalanobis distance-based uncertainty prediction that requires minimal changes to the network's architecture. To obtain Gaussian latent representations favourable for Mahalanobis distance calculation, we introduce a self-supervised representation learning method that separates in-class representations into multiple Gaussians. Classes with non-Gaussian representations are automatically identified and dynamically clustered into multiple new classes that are approximately Gaussian. Evaluation on standard OOD benchmarks shows that our method achieves state-of-the-art results on OOD detection with minimal inference time, and is very competitive on predictive probability calibration. Finally, we show the applicability of our method to a real-life computer vision use case on microorganism classification.
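To make the scoring step concrete, the following is a minimal sketch of generic Mahalanobis distance-based OOD scoring on latent features. It is not the method proposed here: the self-supervised regularization and the dynamic splitting of non-Gaussian classes are not shown. The function names (`fit_class_gaussians`, `mahalanobis_score`), the `eps` covariance ridge, and the synthetic 2-D latents are all assumptions made for illustration.

```python
import numpy as np


def fit_class_gaussians(features, labels, eps=1e-6):
    """Fit one Gaussian per class: returns {class: (mean, inverse covariance)}."""
    stats = {}
    dim = features.shape[1]
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # Small ridge (eps, an assumed value) keeps the covariance invertible.
        cov = np.cov(x, rowvar=False) + eps * np.eye(dim)
        stats[int(c)] = (mean, np.linalg.inv(cov))
    return stats


def mahalanobis_score(z, stats):
    """Return (closest class, squared Mahalanobis distance to that class)."""
    best_class, best_dist = None, np.inf
    for c, (mean, precision) in stats.items():
        diff = z - mean
        dist = float(diff @ precision @ diff)
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class, best_dist


# Synthetic 2-D "latent features" for two well-separated classes (illustrative only).
rng = np.random.default_rng(0)
feats = np.vstack([
    rng.normal([0.0, 0.0], 1.0, (500, 2)),
    rng.normal([8.0, 8.0], 1.0, (500, 2)),
])
labels = np.repeat([0, 1], 500)
stats = fit_class_gaussians(feats, labels)

# A point near class 0 versus a point far from both classes.
in_cls, in_dist = mahalanobis_score(np.array([0.2, -0.1]), stats)
ood_cls, ood_dist = mahalanobis_score(np.array([4.0, -20.0]), stats)
```

A larger minimum distance indicates a sample that is less likely to belong to any training class, so thresholding `best_dist` gives a simple OOD detector. The score is only meaningful when each class is roughly Gaussian in latent space, which is the property the regularization described above is meant to encourage.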