Given a machine learning (ML) model and a prediction, explanations can be defined as sets of features that are sufficient for the prediction. In some applications, besides asking for an explanation, it is also critical to understand whether a sensitive feature can occur in some explanation, or whether a non-interesting feature must occur in all explanations. This paper starts by relating such queries, respectively, to the problems of relevancy and necessity in logic-based abduction. The paper then proves membership and hardness results for several families of ML classifiers. Afterwards, the paper proposes concrete algorithms for two classes of classifiers. The experimental results confirm the scalability of the proposed algorithms.
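To make the two queries concrete, the following is a minimal brute-force sketch, not the algorithms proposed in the paper. It assumes Boolean features and takes an explanation to be a subset-minimal set of features whose values, once fixed to those of the instance, force the prediction. The toy classifier and the helper names (`classify`, `is_sufficient`, `minimal_explanations`, `is_relevant`, `is_necessary`) are hypothetical and chosen only for illustration. The enumeration is exponential in the number of features, so it is suitable for tiny examples only.

```python
from itertools import combinations, product


def classify(x):
    # Hypothetical toy classifier over four Boolean features.
    # Feature 3 is deliberately ignored by the model.
    return int(x[0] and (x[1] or x[2]))


def is_sufficient(classify, instance, fixed):
    """Return True if fixing the features in `fixed` to their values in
    `instance` yields the same prediction for every value of the others."""
    target = classify(instance)
    free = [i for i in range(len(instance)) if i not in fixed]
    for values in product([0, 1], repeat=len(free)):
        point = list(instance)
        for i, v in zip(free, values):
            point[i] = v
        if classify(tuple(point)) != target:
            return False
    return True


def minimal_explanations(classify, instance):
    """Enumerate all subset-minimal sufficient feature sets by brute force."""
    n = len(instance)
    found = []
    # Visit candidates by increasing size, so any sufficient set that
    # contains an already-found explanation can be skipped as non-minimal.
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            candidate = frozenset(subset)
            if any(e <= candidate for e in found):
                continue
            if is_sufficient(classify, instance, candidate):
                found.append(candidate)
    return found


def is_relevant(feature, explanations):
    # Relevancy: the feature occurs in at least one explanation.
    return any(feature in e for e in explanations)


def is_necessary(feature, explanations):
    # Necessity: the feature occurs in every explanation.
    return all(feature in e for e in explanations)


instance = (1, 1, 1, 0)
explanations = minimal_explanations(classify, instance)
```

For this instance the explanations are {0, 1} and {0, 2}: feature 0 is necessary, features 1 and 2 are relevant but not necessary, and feature 3 is irrelevant. Enumerating every explanation in order to answer a single query is the naive approach. It is shown here only to pin down what the queries mean.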