SIDU-TXT: An XAI Algorithm for NLP with a Holistic Assessment Approach

Explainable AI (XAI) aids in deciphering 'black-box' models. While several methods have been proposed and evaluated primarily in the image domain, the exploration of explainability in the text domain remains a growing research area. In this paper, we examine the applicability of XAI methods to the text domain. In this context, the 'Similarity Difference and Uniqueness' (SIDU) XAI method, recognized for its superior capability in localizing entire salient regions in image-based classification, is extended to textual data. The extended method, SIDU-TXT, uses feature activation maps from 'black-box' models to generate heatmaps at a granular, word-based level, thereby providing explanations that highlight the contextually significant textual elements crucial for model predictions. Given the absence of a unified standard for assessing XAI methods, this study applies a holistic three-tiered evaluation framework (Functionally-Grounded, Human-Grounded, and Application-Grounded) to assess the effectiveness of the proposed SIDU-TXT across various experiments. We find that, in a sentiment analysis task on a movie review dataset, SIDU-TXT excels in both functionally-grounded and human-grounded evaluations, outperforming benchmarks such as Grad-CAM and LIME in both quantitative and qualitative analyses. In the application-grounded evaluation within the sensitive and complex legal domain of asylum decision-making, SIDU-TXT and Grad-CAM demonstrate comparable performance, each with its own strengths and weaknesses. However, both methods fall short of fully meeting the experts' sophisticated criteria, highlighting the need for further research on XAI methods suitable for such domains.
