Recently developed uncertainty-aware deep learning methods for multiclass labeling problems provide calibrated class prediction probabilities and out-of-distribution (OOD) indicators, letting machine learning (ML) consumers and engineers gauge a model's confidence in its predictions. However, this additional prediction information is difficult to convey visually in a way that scales to arbitrary data sources and multiple uncertainty contexts. To address these challenges, we present ScatterUQ, an interactive system that provides targeted visualizations to help users better understand model performance in context-driven uncertainty settings. ScatterUQ leverages recent advances in distance-aware neural networks, together with dimensionality reduction techniques, to construct robust 2-D scatter plots explaining why a model predicts a test example to be (1) in-distribution and of a particular class, (2) in-distribution but of uncertain class, or (3) out-of-distribution. ML consumers and engineers can visually compare the salient features of test samples with those of training examples through a "hover callback" to understand model uncertainty performance and decide on follow-up courses of action. We demonstrate the effectiveness of ScatterUQ in explaining model uncertainty for multiclass image classification using a distance-aware neural network trained on Fashion-MNIST and tested on Fashion-MNIST (in-distribution) and MNIST digits (out-of-distribution), as well as for a deep learning model for a cyber dataset. We quantitatively evaluate dimensionality reduction techniques to optimize our contextually driven uncertainty quantification (UQ) visualizations. Our results indicate that the ScatterUQ system should scale to arbitrary multiclass datasets. Our code is available at https://github.com/mit-ll-responsible-ai/equine-webapp