Recently, uncertainty-aware deep learning methods for multiclass labeling problems have been developed that provide calibrated class prediction probabilities and out-of-distribution (OOD) indicators, letting machine learning (ML) consumers and engineers gauge a model's confidence in its predictions. However, this additional prediction information is challenging to convey visually in a way that scales to arbitrary data sources and multiple uncertainty contexts. To address these challenges, we present ScatterUQ, an interactive system that provides targeted visualizations to help users better understand model performance in context-driven uncertainty settings. ScatterUQ leverages recent advances in distance-aware neural networks, together with dimensionality reduction techniques, to construct robust 2-D scatter plots explaining why a model predicts a test example to be (1) in-distribution and of a particular class, (2) in-distribution but of uncertain class, or (3) out-of-distribution. ML consumers and engineers can visually compare the salient features of test samples with those of training examples through a "hover callback" to understand model uncertainty performance and decide on follow-up courses of action. We demonstrate the effectiveness of ScatterUQ in explaining model uncertainty for multiclass image classification with a distance-aware neural network trained on Fashion-MNIST and tested on Fashion-MNIST (in-distribution) and MNIST digits (out-of-distribution), as well as for a deep learning model for a cyber dataset. We quantitatively evaluate dimensionality reduction techniques to optimize our contextually driven uncertainty quantification (UQ) visualizations. Our results indicate that the ScatterUQ system should scale to arbitrary multiclass datasets. Our code is available at https://github.com/mit-ll-responsible-ai/equine-webapp
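The three uncertainty contexts named in the abstract can be illustrated with a minimal Python sketch. This is not the ScatterUQ implementation: the function name `uncertainty_context`, the two threshold parameters, and their default values are hypothetical, chosen only to show how calibrated class probabilities and an OOD indicator might be combined to pick which kind of scatter plot to show for a test example.

```python
from typing import Sequence


def uncertainty_context(
    class_probs: Sequence[float],
    ood_score: float,
    ood_threshold: float = 0.5,         # hypothetical cutoff, not from the paper
    confidence_threshold: float = 0.7,  # hypothetical cutoff, not from the paper
) -> str:
    """Assign a test example to one of the three contexts in the abstract.

    class_probs: calibrated class prediction probabilities for one example.
    ood_score:   OOD indicator in [0, 1]; larger is assumed to mean "more OOD".
    """
    # Context (3): the OOD indicator dominates, whatever the class probabilities say.
    if ood_score >= ood_threshold:
        return "out-of-distribution"

    # Context (1): in-distribution and one class clearly wins.
    if max(class_probs) >= confidence_threshold:
        return "in-distribution, confident class"

    # Context (2): in-distribution, but no class is convincing.
    return "in-distribution, class uncertain"


if __name__ == "__main__":
    print(uncertainty_context([0.90, 0.05, 0.05], ood_score=0.1))
    print(uncertainty_context([0.45, 0.40, 0.15], ood_score=0.1))
    print(uncertainty_context([0.90, 0.05, 0.05], ood_score=0.9))
```

In a real system the thresholds would presumably be tuned per dataset, and the returned context would determine which training examples are plotted alongside the test sample.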