Despite the wide use of $k$-Nearest Neighbors as classification models, their explainability properties remain poorly understood from a theoretical perspective. Nearest neighbor classifiers offer interpretability from a "data perspective": the classification of an input vector $\bar{x}$ is explained by identifying the vectors $\bar{v}_1, \ldots, \bar{v}_k$ in the training set that determine it. We argue, however, that such explanations can be impractical in high-dimensional applications, where each vector has hundreds or thousands of features whose relative importance is unclear. Hence, we focus on understanding nearest neighbor classifications through a "feature perspective", in which the goal is to identify how the values of the features in $\bar{x}$ affect its classification. Concretely, we study abductive explanations such as "minimum sufficient reasons", which correspond to sets of features in $\bar{x}$ that suffice to guarantee its classification, and "counterfactual explanations", based on the minimum-distance feature changes one would have to apply to $\bar{x}$ to change its classification. We present a detailed landscape of positive and negative complexity results for counterfactual and abductive explanations, distinguishing between discrete and continuous feature spaces and considering the impact of the choice of distance function. Finally, we show that, despite some negative complexity results, Integer Quadratic Programming and SAT solving allow explanations to be computed in practice.
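To make the notion of a counterfactual explanation concrete, the following is a minimal sketch of a brute-force search for a minimum Hamming-distance counterfactual of a $k$-NN classifier over binary features. The training set, the tie-breaking rule, and all names here are illustrative assumptions, not from the paper; the brute-force search is exponential in the number of features, which is precisely why the paper turns to Integer Quadratic Programming and SAT solving for realistic instances.

```python
from itertools import combinations

# Toy training set over three binary features with labels 0/1 (illustrative only).
train = [((0, 0, 0), 0), ((1, 1, 0), 1), ((1, 0, 1), 1), ((0, 1, 1), 0)]

def hamming(u, v):
    # Number of positions in which u and v differ.
    return sum(a != b for a, b in zip(u, v))

def knn_predict(x, k=3):
    # Majority vote among the k nearest training vectors
    # (ties in distance broken by training-set order).
    nearest = sorted(train, key=lambda tv: hamming(x, tv[0]))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def min_counterfactual(x, k=3):
    """Smallest set of feature flips that changes the k-NN classification of x.

    Enumerates flip sets in order of increasing Hamming distance, so the
    first hit is a minimum-distance counterfactual. Exponential in the
    number of features; viable only for tiny examples like this one.
    """
    base = knn_predict(x, k)
    n = len(x)
    for size in range(1, n + 1):
        for idxs in combinations(range(n), size):
            y = tuple(1 - v if i in idxs else v for i, v in enumerate(x))
            if knn_predict(y, k) != base:
                return idxs, y
    return None
```

For instance, `min_counterfactual((0, 0, 0))` reports which single feature flip (if any) already suffices to change the majority vote of the three nearest neighbors.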