The development of machine learning applications has accelerated in recent years, driven by the ability of learning-based systems to discover and generalize intricate patterns hidden in massive datasets. Modern learning models, while powerful, are often so complex that they act as opaque black boxes, and this lack of transparency hinders our ability to understand their decision-making processes. Such opacity limits the interpretability and practical application of machine learning, especially in critical domains where understanding the reasons behind a prediction is essential for informed decision-making. Explainable Artificial Intelligence (XAI) addresses this challenge by providing explanations of black-box behavior. Among the various XAI approaches, feature attribution (or feature importance) methods stand out for their capacity to quantify the significance of each input feature in the prediction process. However, most existing attribution methods have limitations such as instability, where divergent explanations may result from similar instances or even from the same instance. In this work, we introduce T-Explainer, a novel local additive attribution explainer based on Taylor expansion. It is endowed with desirable properties, such as local accuracy and consistency, while remaining stable over multiple runs. We demonstrate T-Explainer's effectiveness through benchmark experiments against well-known attribution methods. In addition, T-Explainer is developed as a comprehensive XAI framework that includes quantitative metrics to assess and visualize attribution explanations.
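To make the idea of a local additive attribution built on a Taylor expansion more concrete, the following is a minimal sketch and not T-Explainer's actual implementation, whose details the abstract does not give. It assumes a scalar-valued model `f`, a reference point `baseline`, and central finite differences for the gradient; the function name `taylor_attributions` and the step size `h` are hypothetical choices made for illustration only. Under a first-order expansion around `x`, the attributions sum approximately to `f(x) - f(baseline)`, and the procedure is deterministic, so repeated runs on the same instance return the same explanation.

```python
import numpy as np


def taylor_attributions(f, x, baseline, h=1e-4):
    """Illustrative first-order Taylor attributions (hypothetical helper).

    Approximates f(x) - f(baseline) by sum_i grad_i(x) * (x_i - baseline_i),
    where the gradient is estimated with central finite differences.
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        # Central difference estimate of the partial derivative w.r.t. feature i
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    # One additive contribution per input feature
    return grad * (x - baseline)


def linear_model(z):
    """Toy model used only for this example."""
    return 2.0 * z[0] - 3.0 * z[1] + 1.0


x = np.array([1.0, 2.0])
baseline = np.zeros(2)
attributions = taylor_attributions(linear_model, x, baseline)
```

For a linear model such as the toy one above, the first-order expansion is exact, so the attributions recover each coefficient times the feature's displacement from the baseline. For nonlinear models the sum only approximates the change in output, and the quality of that approximation depends on how far `x` is from `baseline`.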