A significant amount of current research in artificial intelligence aims to improve the explainability and interpretability of deep learning models. End-users find it easier to trust a system when they understand why it produced a given output. Recommender systems are one class of systems for which considerable effort has gone into making the output more explainable. One way to produce more explainable output is counterfactual reasoning: altering a minimal set of features to generate a counterfactual item that changes the output of the system. This process identifies the input features that have a significant impact on the desired output, which leads to effective explanations. In this paper, we present a method for generating counterfactual explanations for both tabular and textual features. We evaluated the proposed method on three real-world datasets and demonstrated a 5\% improvement in identifying effective features (according to model-based measures) over the baseline method.
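The counterfactual idea described above, changing as few features as possible until the system's output changes, can be illustrated with a small sketch. This is not the method proposed in the paper: it is a brute-force search over feature subsets, and the model `toy_recommender`, its weights and threshold, the feature names, and the function `find_counterfactual` are all hypothetical names introduced only for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]


def toy_recommender(item: Features) -> bool:
    """Hypothetical stand-in model: recommend when a weighted score reaches 0.45."""
    weights = {"price_match": 0.5, "genre_match": 0.3, "popularity": 0.2}
    score = sum(weights[name] * item[name] for name in weights)
    return score >= 0.45


def find_counterfactual(
    item: Features,
    model: Callable[[Features], bool],
    replacements: Features,
) -> Optional[Tuple[Tuple[str, ...], Features]]:
    """Return the smallest set of altered features that flips the model output.

    Subsets are tried in order of increasing size, so the first flip found
    uses a minimal number of changed features. Returns None if no subset of
    the allowed replacements changes the output.
    """
    original_output = model(item)
    names = sorted(replacements)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            candidate = dict(item)  # copy so the input item is not mutated
            for name in subset:
                candidate[name] = replacements[name]
            if model(candidate) != original_output:
                return subset, candidate
    return None


# Usage: an item that is currently recommended (score 0.8).
item = {"price_match": 1.0, "genre_match": 1.0, "popularity": 0.0}
# Assumed alternative value for each feature we are allowed to alter.
replacements = {"price_match": 0.0, "genre_match": 0.0, "popularity": 0.0}

result = find_counterfactual(item, toy_recommender, replacements)
print(result)
```

In this toy setting, altering `price_match` alone is enough to flip the recommendation, so that feature would be reported as the explanation. The exhaustive search grows exponentially with the number of features, so it only serves to make the definition concrete.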