In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the underlying causes of the knowledge acquired by EDA is crucial, yet this problem remains under-researched. This study promotes a transparent and explainable perspective on data analysis, called eXplainable Data Analysis (XDA). To this end, we present XInsight, a general framework for XDA. XInsight provides data analysis with qualitative and quantitative explanations of causal and non-causal semantics. In this way, it significantly improves human understanding of, and confidence in, the outcomes of data analysis, facilitating accurate data interpretation and decision making in the real world. XInsight is a three-module, end-to-end pipeline designed to extract causal graphs, translate causal primitives into XDA semantics, and quantify the contribution of each explanation to a data fact. XInsight uses a set of design concepts and optimizations to address the inherent difficulties of integrating causality into XDA. Experiments on synthetic and real-world datasets, as well as a user study, demonstrate the highly promising capabilities of XInsight.