Recent work shows that the data distribution in a network's latent space is useful for estimating classification uncertainty and detecting out-of-distribution (OOD) samples. To obtain a well-regularized latent space that is conducive to uncertainty estimation, existing methods introduce significant changes to model architectures and training procedures. In this paper, we present a lightweight, fast, and high-performance regularization method for Mahalanobis distance-based uncertainty prediction that requires only minimal changes to the network's architecture. To obtain Gaussian latent representations favourable for Mahalanobis distance calculation, we introduce a self-supervised representation learning method that separates in-class representations into multiple Gaussians. Classes with non-Gaussian representations are automatically identified and dynamically clustered into multiple new classes that are approximately Gaussian. Evaluation on standard OOD benchmarks shows that our method achieves state-of-the-art results on OOD detection with minimal inference time and is very competitive on predictive probability calibration. Finally, we show the applicability of our method to a real-life computer vision use case in microorganism classification.
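To make the idea of Mahalanobis distance-based OOD scoring concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes latent features have already been extracted, fits one Gaussian per class with a single shared covariance (an assumption made here for brevity; the abstract does not specify the covariance structure), and scores a sample by its smallest squared Mahalanobis distance to any class mean. The function names `fit_class_gaussians` and `mahalanobis_ood_score`, the ridge term, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_class_gaussians(features, labels):
    """Estimate per-class means and a shared covariance from latent features."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Centre each sample on its own class mean, then pool into one covariance.
    centred = features - means[np.searchsorted(classes, labels)]
    cov = centred.T @ centred / len(features)
    # Small ridge term keeps the covariance invertible (assumption of this sketch).
    cov += 1e-6 * np.eye(features.shape[1])
    return classes, means, np.linalg.inv(cov)


def mahalanobis_ood_score(x, means, precision):
    """Minimum squared Mahalanobis distance to any class mean, per sample."""
    diff = x[:, None, :] - means[None, :, :]  # shape (n, k, d)
    d2 = np.einsum("nkd,de,nke->nk", diff, precision, diff)
    return d2.min(axis=1)


# Synthetic two-class latent space, for illustration only.
rng = np.random.default_rng(0)
train = np.concatenate([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
labels = np.repeat([0, 1], 200)
classes, means, precision = fit_class_gaussians(train, labels)

queries = np.array([[0.1, -0.2],     # close to class 0
                    [30.0, -30.0]])  # far from both classes
scores = mahalanobis_ood_score(queries, means, precision)
print(scores)  # a larger score suggests the sample is more likely OOD
```

A score of this kind is only meaningful when each class is roughly Gaussian in the latent space, which is the property the regularization described above aims to encourage by splitting non-Gaussian classes into several approximately Gaussian ones.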