Interpretability and explainability are among the most important challenges of modern artificial intelligence and are even mentioned in various legislative sources. In this article, we develop a method for extracting immediately human-interpretable classifiers from tabular data. The classifiers are given as short Boolean formulas built from propositions that are either extracted directly from categorical attributes or computed dynamically from numeric ones. Our method is implemented using Answer Set Programming. We investigate seven datasets and compare our results to those obtainable by state-of-the-art classifiers for tabular data, namely XGBoost and random forests. Across all datasets, the accuracies obtainable by our method are similar to those of the reference methods. The advantage of our classifiers in all cases is that they are very short and immediately human-intelligible, in contrast to the black-box nature of the reference methods.
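To make the notion of a short Boolean formula classifier concrete, the following is a minimal illustrative sketch in Python. It is not the Answer Set Programming implementation described above and performs no formula search. The attribute names (`smoker`, `age`, `bmi`), the thresholds, and the formula itself are hypothetical and chosen only to show the two kinds of propositions: one read directly from a categorical attribute and one computed from a numeric attribute by a threshold.

```python
from typing import Callable, Dict, Union

Row = Dict[str, Union[str, float]]
Prop = Callable[[Row], bool]


def cat_is(attr: str, value: str) -> Prop:
    """Proposition read directly off a categorical attribute."""
    return lambda row: row[attr] == value


def num_gt(attr: str, threshold: float) -> Prop:
    """Proposition computed from a numeric attribute via a threshold."""
    return lambda row: row[attr] > threshold


# Hypothetical propositions (names and thresholds are assumptions).
is_smoker = cat_is("smoker", "yes")
is_older = num_gt("age", 50.0)
high_bmi = num_gt("bmi", 35.0)


def classify(row: Row) -> bool:
    """Hypothetical short formula: (smoker AND age > 50) OR bmi > 35."""
    return (is_smoker(row) and is_older(row)) or high_bmi(row)


# Made-up rows, used only to exercise the formula.
rows = [
    {"smoker": "yes", "age": 61.0, "bmi": 24.0},
    {"smoker": "no", "age": 61.0, "bmi": 24.0},
    {"smoker": "no", "age": 30.0, "bmi": 38.5},
]
predictions = [classify(r) for r in rows]
```

A formula of this size can be read and checked by a person at a glance, which is the kind of intelligibility the abstract contrasts with black-box models.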