Feature attribution, a fundamental task in machine learning and data analysis, determines the contribution of individual features or variables to a model's output, and thereby helps identify the features that matter most for predicting an outcome. The history of feature attribution methods can be traced back to Generalized Additive Models (GAMs), which extend linear regression by allowing non-linear relationships between the dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to explain complex Artificial Intelligence (AI) systems, but each of these approaches has limitations: GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the state of the art, we propose a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses argumentation by treating each feature as an argument that can support, attack, or neutralize a prediction. Because CA-FATA formulates feature attribution as an argumentation procedure in which every computation has explicit semantics, it is inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, which leads to more accurate predictions.
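To make the idea of features acting as arguments concrete, the following minimal Python sketch may help. It is an illustration only and not the CA-FATA procedure itself: the `FeatureArgument` class, the signed `strength` value, the additive aggregation in `attribute`, and the example feature names are all assumptions introduced here. A positive strength is read as support, a negative one as attack, and a value near zero as neutral.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureArgument:
    """A feature viewed as an argument about a prediction (hypothetical)."""
    name: str
    strength: float  # assumed signed strength: > 0 supports, < 0 attacks


def role(arg: FeatureArgument, tol: float = 1e-6) -> str:
    """Classify an argument by the sign of its strength."""
    if arg.strength > tol:
        return "support"
    if arg.strength < -tol:
        return "attack"
    return "neutral"


def attribute(arguments, base_score=0.0):
    """Aggregate arguments additively (an assumed, simplified rule).

    Returns the resulting score and the role of each feature.
    """
    score = base_score + sum(a.strength for a in arguments)
    roles = {a.name: role(a) for a in arguments}
    return score, roles


# Hypothetical example: three features arguing about one prediction.
arguments = [
    FeatureArgument("genre", 0.6),
    FeatureArgument("price", -0.25),
    FeatureArgument("release_year", 0.0),
]
score, roles = attribute(arguments, base_score=3.0)
print(score, roles)
```

In this sketch each feature's role and its contribution to the score can be read off directly, which is the kind of explicit semantics the abstract refers to. How CA-FATA actually computes argument strengths and incorporates users' contexts is not specified in this abstract.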