A significant amount of current research in artificial intelligence aims to improve the explainability and interpretability of deep learning models. End-users have been found to trust a system more readily when they understand why it produced a given output. Recommender systems are one class of systems for which considerable effort has gone into making the output more explainable. One method for producing more explainable output is counterfactual reasoning, which alters a minimal set of features to generate a counterfactual item that changes the output of the system. This process identifies the input features that have a significant impact on the desired output, which leads to effective explanations. In this paper, we present a method for generating counterfactual explanations for both tabular and textual features. We evaluate the proposed method on three real-world datasets and demonstrate a 5\% improvement in finding effective features (based on model-based measures) over the baseline method.
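To make the idea of counterfactual reasoning concrete, the following is a minimal sketch, not the method proposed in this paper. It assumes a hypothetical linear scorer (`toy_score`) in place of a real recommender, tabular features only, and a simple greedy search: features are reset to a baseline value one at a time, always choosing the change that lowers the score most, until the score falls below a recommendation threshold. The feature names, weights, and threshold are all illustrative assumptions.

```python
from typing import Callable, Dict, List, Optional

Features = Dict[str, float]


def toy_score(features: Features) -> float:
    """Hypothetical stand-in for a recommender: a fixed linear scorer."""
    weights = {"genre_match": 2.0, "price_fit": 1.0, "popularity": 0.5}
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def greedy_counterfactual(
    features: Features,
    score: Callable[[Features], float],
    threshold: float,
    baseline: float = 0.0,
) -> Optional[List[str]]:
    """Greedily reset features to `baseline` until the score drops below `threshold`.

    Returns the names of the changed features in the order they were changed
    (an empty list if the item is already below the threshold), or None if
    resetting every feature still leaves the score at or above the threshold.
    The greedy choice keeps the set small but does not guarantee a minimum.
    """
    current = dict(features)  # work on a copy so the input is untouched
    changed: List[str] = []
    while score(current) >= threshold:
        candidates = [name for name in current if name not in changed]
        if not candidates:
            return None  # no counterfactual found within this search space
        # Pick the single feature whose reset lowers the score the most.
        best = min(candidates, key=lambda name: score({**current, name: baseline}))
        current[best] = baseline
        changed.append(best)
    return changed


item = {"genre_match": 1.0, "price_fit": 1.0, "popularity": 1.0}
explanation = greedy_counterfactual(item, toy_score, threshold=2.0)
print(explanation)  # ['genre_match']
```

In this toy setting, resetting `genre_match` alone moves the score from 3.5 to 1.5 and flips the outcome, so that feature would be reported as the explanation. Textual features would need a different perturbation operator (for example, removing or replacing words), which this sketch does not cover.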