"Honey, Tell Me What's Wrong", Global Explanation of Textual Discriminative Models through Cooperative Generation

The ubiquity of complex machine learning models has raised the importance of model-agnostic explanation algorithms. These methods create artificial instances by slightly perturbing real instances and capture the resulting shifts in model decisions. However, such methods rely on initial data and only explain the model's decisions for those instances. To tackle these problems, we propose Therapy, the first global and model-agnostic explanation method adapted to text that requires no input dataset. Therapy generates texts following the distribution learned by a classifier through cooperative generation. Because it does not rely on initial samples, it can generate explanations even when data is absent (e.g., for confidentiality reasons). Moreover, unlike existing methods that combine multiple local explanations into a global one, Therapy offers a global overview of the model's behavior over the input space. Our experiments show that, despite using no input data to generate samples, Therapy provides insightful information about the features used by the classifier that is competitive with that from methods relying on input samples, and outperforms them when input samples are not specific to the studied model.
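To make the idea of cooperative generation concrete, here is a minimal toy sketch, not the authors' implementation. It assumes one common reading of the term: each candidate next word is weighted by a language model's probability multiplied by the classifier's probability of the target class for the extended text. The vocabulary, the uniform `lm_probs`, the keyword-based `classifier_prob`, and the frequency-difference scoring in `global_term_scores` are all hypothetical stand-ins; the abstract does not say how generated texts are converted into feature importances.

```python
import math
import random
from collections import Counter

# Hypothetical toy vocabulary (assumption, not from the paper).
VOCAB = ["the", "movie", "was", "plot", "great", "wonderful", "awful", "boring"]

# Hidden weights of the toy "black-box" classifier we want to explain.
_POSITIVE = {"great": 2.0, "wonderful": 1.5}
_NEGATIVE = {"awful": 2.0, "boring": 1.5}


def lm_probs(context):
    """Toy stand-in for a language model: uniform next-word distribution."""
    return {word: 1.0 / len(VOCAB) for word in VOCAB}


def classifier_prob(words, label):
    """Toy classifier: P(label | text) from a logistic score over keywords."""
    score = sum(_POSITIVE.get(w, 0.0) - _NEGATIVE.get(w, 0.0) for w in words)
    p_positive = 1.0 / (1.0 + math.exp(-score))
    return p_positive if label == "positive" else 1.0 - p_positive


def cooperative_sample(label, length, rng):
    """Sample a text word by word, weighting each candidate by
    LM probability times classifier probability of the target label."""
    words = []
    for _ in range(length):
        lm = lm_probs(words)
        weights = [lm[w] * classifier_prob(words + [w], label) for w in VOCAB]
        words.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return words


def global_term_scores(n_texts=500, length=6, seed=0):
    """Generate texts for each class without any input dataset, then score
    each word by its relative-frequency difference between the classes."""
    rng = random.Random(seed)
    counts = {"positive": Counter(), "negative": Counter()}
    for label, counter in counts.items():
        for _ in range(n_texts):
            counter.update(cooperative_sample(label, length, rng))
    total = n_texts * length
    return {
        w: (counts["positive"][w] - counts["negative"][w]) / total
        for w in VOCAB
    }


if __name__ == "__main__":
    scores = global_term_scores()
    for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{word:10s} {score:+.3f}")
```

In this sketch the words the toy classifier relies on rise to the top or sink to the bottom of the ranking, while neutral words stay near zero, which is the kind of global overview the abstract describes. A real setup would replace the uniform `lm_probs` with a pretrained language model and the keyword scorer with the classifier under study.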
