T-Explainer: A Model-Agnostic Explainability Framework Based on Gradients

Machine learning applications have grown significantly in recent years, driven by the ability of learning-based systems to discover and generalize intricate patterns hidden in massive datasets. Modern learning models, while powerful, are often so complex that they behave as opaque black boxes, and this lack of transparency hinders our ability to understand their reasoning. Such opacity limits the interpretability and practical use of machine learning, especially in critical domains where understanding the reasons behind a prediction is essential for informed decision-making. Explainable Artificial Intelligence (XAI) has emerged to address this challenge by providing explanations of black-box behavior. Among the various XAI approaches, feature attribution (also called feature importance) stands out for its capacity to quantify the significance of each input feature in the prediction process. However, most existing attribution methods have limitations, such as instability: similar instances, or even the same instance, may yield divergent explanations. This work introduces T-Explainer, a novel local additive attribution explainer based on Taylor expansion. It has desirable properties, such as local accuracy and consistency, that make T-Explainer stable over multiple runs. We demonstrate T-Explainer's effectiveness in quantitative benchmark experiments against well-known attribution methods. Additionally, we provide several tools to evaluate and visualize explanations, turning T-Explainer into a comprehensive XAI framework.
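To make the idea of a Taylor-based local additive attribution concrete, the sketch below shows one generic way such an explanation could be computed. It is not T-Explainer's implementation, which the abstract does not specify. It assumes a scalar-valued black-box `predict` function, estimates its gradient with central finite differences, and attributes to each feature the product of its partial derivative and its displacement from a baseline. The names `finite_difference_gradient`, `taylor_attributions`, `predict`, `weights`, and `baseline`, the step size `h`, and the toy logistic model are all illustrative assumptions.

```python
import numpy as np


def finite_difference_gradient(predict, x, h=1e-4):
    """Central-difference estimate of the gradient of a scalar-valued predict at x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (predict(x + step) - predict(x - step)) / (2.0 * h)
    return grad


def taylor_attributions(predict, x, baseline, h=1e-4):
    """First-order Taylor attributions: gradient at x times (x - baseline)."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    return finite_difference_gradient(predict, x, h) * (x - baseline)


# Hypothetical black box: a small logistic model (an assumption for illustration).
weights = np.array([1.5, -2.0, 0.5])


def predict(x):
    return 1.0 / (1.0 + np.exp(-(np.asarray(x, dtype=float) @ weights)))


x = np.array([0.2, 0.1, -0.3])
baseline = np.zeros(3)

phi = taylor_attributions(predict, x, baseline)
phi_again = taylor_attributions(predict, x, baseline)

# The procedure involves no random sampling, so repeated runs agree exactly.
print("attributions:", phi)
print("identical on rerun:", np.array_equal(phi, phi_again))

# Additive reconstruction: baseline prediction plus the attributions
# approximates the prediction at x (only to first order, near the baseline).
print("reconstructed:", predict(baseline) + phi.sum(), "actual:", predict(x))
```

Because this sketch draws no random samples, explaining the same instance twice returns identical attributions, which is the kind of stability the abstract contrasts with unstable attribution methods. The additive reconstruction holds only approximately here, since a first-order expansion ignores curvature between the baseline and the instance.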
