On Formal Feature Attribution and Its Approximation

Recent years have witnessed the widespread use of artificial intelligence (AI) algorithms and machine learning (ML) models. Despite their tremendous success, a number of vital problems like ML model brittleness, their fairness, and the lack of interpretability warrant the need for the active developments in explainable artificial intelligence (XAI) and formal ML model verification. The two major lines of work in XAI include feature selection methods, e.g. Anchors, and feature attribution techniques, e.g. LIME and SHAP. Despite their promise, most of the existing feature selection and attribution approaches are susceptible to a range of critical issues, including explanation unsoundness and out-of-distribution sampling. A recent formal approach to XAI (FXAI) although serving as an alternative to the above and free of these issues suffers from a few other limitations. For instance and besides the scalability limitation, the formal approach is unable to tackle the feature attribution problem. Additionally, a formal explanation despite being formally sound is typically quite large, which hampers its applicability in practical settings. Motivated by the above, this paper proposes a way to apply the apparatus of formal XAI to the case of feature attribution based on formal explanation enumeration. Formal feature attribution (FFA) is argued to be advantageous over the existing methods, both formal and non-formal. Given the practical complexity of the problem, the paper then proposes an efficient technique for approximating exact FFA. Finally, it offers experimental evidence of the effectiveness of the proposed approximate FFA in comparison to the existing feature attribution algorithms not only in terms of feature importance and but also in terms of their relative order.

翻译：近年来，人工智能算法与机器学习模型得到了广泛应用。尽管取得了巨大成功，但机器学习模型的脆弱性、公平性问题以及缺乏可解释性等关键问题，亟需在可解释人工智能（XAI）和形式化机器学习模型验证领域开展积极研究。XAI的两大主要研究方向包括特征选择方法（如Anchors）和特征归因技术（如LIME和SHAP）。尽管这些方法前景广阔，但现有的大多数特征选择与归因方法都存在一系列关键问题，包括解释不完整和分布外采样问题。近期提出的形式化XAI方法（FXAI）虽可作为上述方法的替代方案且不受这些问题影响，但其本身也存在若干局限性。例如，除了可扩展性限制外，形式化方法无法解决特征归因问题。此外，形式化解释虽然在逻辑上完整，但通常规模过大，这限制了其在实际场景中的适用性。基于上述问题，本文提出了一种将形式化XAI方法应用于基于形式化解释枚举的特征归因问题的方案。形式化特征归因（FFA）被论证优于现有方法（包括形式化与非形式化方法）。针对该问题的实际复杂性，本文随后提出了一种高效近似精确FFA的技术。最后，通过实验证据表明，本文提出的近似FFA方法相较于现有特征归因算法，不仅在特征重要性方面表现出色，在特征的相对排序方面也具有显著优势。