Recent work shows that the data distribution in a network's latent space is useful for estimating classification uncertainty and detecting out-of-distribution (OOD) samples. To obtain a well-regularized latent space that is conducive to uncertainty estimation, existing methods make significant changes to model architectures and training procedures. In this paper, we present a lightweight, fast, and high-performance regularization method for Mahalanobis distance-based uncertainty prediction that requires only minimal changes to the network's architecture. To derive Gaussian latent representations favourable for Mahalanobis distance calculation, we introduce a self-supervised representation learning method that separates in-class representations into multiple Gaussians. Classes with non-Gaussian representations are automatically identified and dynamically clustered into multiple new classes that are approximately Gaussian. Evaluation on standard OOD benchmarks shows that our method achieves state-of-the-art results on OOD detection with minimal inference time, and is very competitive on predictive probability calibration. Finally, we show the applicability of our method to a real-life computer vision use case on microorganism classification.
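As a rough illustration of the Mahalanobis distance-based scoring that the abstract refers to, the sketch below fits one Gaussian per class in a latent space and scores a sample by its squared distance to the nearest class mean. It is not the paper's implementation: the function names `fit_class_gaussians` and `mahalanobis_score`, the covariance shared across classes, the `eps` ridge term, and the synthetic two-class data are all assumptions made for the example. The self-supervised splitting of non-Gaussian classes is not shown.

```python
import numpy as np

def fit_class_gaussians(features, labels, eps=1e-6):
    """Estimate per-class means and a shared precision matrix (assumed setup)."""
    classes = np.unique(labels)  # sorted unique labels
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center each sample on its own class mean, then pool the covariance.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    cov += eps * np.eye(features.shape[1])  # small ridge for invertibility
    return classes, means, np.linalg.inv(cov)

def mahalanobis_score(x, means, precision):
    """Return the squared distance to the nearest class mean and its index."""
    diff = x[:, None, :] - means[None, :, :]  # shape (n, k, d)
    d2 = np.einsum("nkd,de,nke->nk", diff, precision, diff)
    return d2.min(axis=1), d2.argmin(axis=1)

# Synthetic latent features: two roughly Gaussian classes (illustrative only).
rng = np.random.default_rng(0)
feats = np.concatenate([
    rng.normal(0.0, 1.0, size=(200, 2)),
    rng.normal(5.0, 1.0, size=(200, 2)),
])
labs = np.repeat([0, 1], 200)

classes, means, precision = fit_class_gaussians(feats, labs)
queries = np.array([[0.1, -0.1], [20.0, -20.0]])  # near class 0, far from both
scores, nearest = mahalanobis_score(queries, means, precision)
```

A larger score means the sample lies farther from every class Gaussian, so it can be thresholded as an OOD indicator. The score is only meaningful when each class is close to Gaussian in the latent space, which is the property the proposed regularization aims to encourage.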