At present, there are no easily understood explainable artificial intelligence (AI) methods for discrete token inputs such as text. Most explainable AI techniques do not extend well to token sequences, where both local and global features matter, because state-of-the-art models such as transformers tend to focus on global connections. As a result, existing explainable AI algorithms fail by (i) identifying disparate tokens of importance, or (ii) assigning a low importance value to a large number of tokens. This method for explaining token-based classifiers generalizes a mask-based explainable AI algorithm for images. It starts with an Explainer neural network that is trained to create masks that hide information not relevant to classification. The Hadamard product of the mask and the continuous values of the classifier's embedding layer is then taken and passed through the classifier, which changes the magnitude of each embedding vector but keeps its orientation unchanged. The Explainer is trained for a taxonomic classifier for nucleotide sequences, and it is shown that the masked segments are less relevant to classification than the unmasked ones. The method focuses on the importance of the token as a whole (i.e., a segment of the input sequence), producing a human-readable explanation.
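The masking step can be illustrated with a minimal NumPy sketch. It assumes one reading of the abstract: the mask holds a single scalar in [0, 1] per token, broadcast across the embedding dimension before the element-wise (Hadamard) product. The function name `apply_token_mask`, the array shapes, and the example values are hypothetical and are not taken from the original work; the Explainer network and the classifier are not modeled here.

```python
import numpy as np


def apply_token_mask(embeddings, mask):
    """Scale each token embedding by its scalar mask value.

    Assumption (not from the source): `embeddings` has shape (seq_len, dim)
    and `mask` has shape (seq_len,) with values in [0, 1].
    """
    embeddings = np.asarray(embeddings, dtype=float)
    mask = np.asarray(mask, dtype=float)
    if mask.shape != embeddings.shape[:1]:
        raise ValueError("mask must hold exactly one value per token")
    # Broadcasting the mask over the embedding dimension is equivalent to a
    # Hadamard product with a mask tiled to the shape of `embeddings`.
    return embeddings * mask[:, None]


# Hypothetical data: 5 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))
mask = np.array([1.0, 0.5, 0.0, 0.25, 1.0])

masked = apply_token_mask(embeddings, mask)

# The norm of each token embedding shrinks by exactly its mask value.
norm_ratio = np.linalg.norm(masked, axis=1) / np.linalg.norm(embeddings, axis=1)

# For tokens that are not fully masked, the direction is unchanged, so the
# cosine similarity between the original and masked vectors is 1.
kept = mask > 0
cosine = np.sum(masked[kept] * embeddings[kept], axis=1) / (
    np.linalg.norm(masked[kept], axis=1) * np.linalg.norm(embeddings[kept], axis=1)
)
```

Under this assumption, a mask value of 0 removes a token's embedding entirely, while intermediate values attenuate it without rotating it, which matches the abstract's statement that magnitude changes and orientation does not.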