Topological Interpretability for Deep-Learning

With the increasing adoption of AI-based systems across everyday life, the need to understand their decision-making mechanisms is growing accordingly. How far we can trust the statistical inferences made by AI-based decision systems is an increasing concern, especially in high-risk settings such as criminal justice or medical diagnosis, where incorrect inferences may have tragic consequences. Despite their success in solving problems involving real-world data, deep learning (DL) models cannot quantify the certainty of their predictions and are frequently quite confident even when their predictions are incorrect. This work presents a method to infer prominent features in two DL classification models trained on clinical and non-clinical text by employing techniques from topological and geometric data analysis. We create a graph of a model's prediction space and cluster the inputs into the graph's vertices by the similarity of features and prediction statistics. We then extract subgraphs demonstrating high predictive accuracy for a given label. These subgraphs contain a wealth of information about the features that the DL model has recognized as relevant to its decisions. We infer these features for a given label using a distance metric between probability measures, and demonstrate the stability of our method compared to the LIME interpretability method. This work demonstrates that we can gain insight into the decision mechanism of a DL model, which allows us to ascertain whether the model is basing its decisions on information germane to the problem or on extraneous patterns within the data.
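
The final step described above, inferring features for a label from a distance between probability measures, can be illustrated with a minimal sketch. The abstract does not name the metric, so total variation distance is used here purely as a stand-in. The function names, the token-frequency representation, and the toy documents are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def word_distribution(docs):
    """Empirical probability of each token across a list of tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts).

    Assumption: a stand-in for the unspecified metric in the abstract.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

def prominent_features(subgraph_docs, all_docs, k=3):
    """Rank tokens by how over-represented they are in the subgraph
    relative to the whole corpus (hypothetical scoring rule)."""
    p = word_distribution(subgraph_docs)
    q = word_distribution(all_docs)
    excess = {t: p[t] - q.get(t, 0.0) for t in p}
    # Sort by descending excess mass, breaking ties alphabetically.
    return sorted(excess, key=lambda t: (-excess[t], t))[:k]

# Invented toy corpus: four tokenized documents.
all_docs = [
    ["tumor", "left", "breast"],
    ["tumor", "right", "lung"],
    ["biopsy", "left", "lung"],
    ["biopsy", "right", "breast"],
]
# Pretend these inputs fell in a subgraph with high accuracy for one label.
subgraph_docs = [all_docs[0], all_docs[3]]

distance = total_variation(word_distribution(subgraph_docs),
                           word_distribution(all_docs))
top = prominent_features(subgraph_docs, all_docs, k=1)
print(distance, top)
```

In this toy setting the only token with excess probability mass in the subgraph is the one shared by both of its documents, which is the kind of signal such a comparison is meant to surface. A real pipeline would substitute the metric and feature representation actually used by the method.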

