Most AI systems are black boxes that generate reasonable outputs for given inputs. Some domains, however, have explainability and trustworthiness requirements that such approaches cannot directly meet. Various methods have therefore been developed to interpret black-box models after training. This paper advocates an alternative approach in which the models are transparent and explainable to begin with. The approach, EVOTER, evolves rule-sets based on simple logical expressions. It is evaluated in several prediction/classification and prescription/policy-search domains, with and without a surrogate. It is shown to discover meaningful rule-sets that perform similarly to black-box models. The rules can provide insight into the domain and make biases hidden in the data explicit. It may also be possible to edit them directly to remove biases and add constraints. EVOTER thus forms a promising foundation for building trustworthy AI systems for real-world applications in the future.
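To make the idea of a transparent model concrete, the following is a minimal sketch of a rule-set built from simple logical expressions and evaluated in order of priority. The representation, feature names, and evaluation scheme are illustrative assumptions only, not EVOTER's actual encoding or evolutionary algorithm.

```python
# Hypothetical sketch: a rule-set as an ordered list of
# (condition, outcome) pairs over named features. The first rule whose
# condition holds determines the output; a default applies otherwise.
# Feature names and thresholds below are invented for illustration.

def evaluate(rule_set, sample, default):
    """Return the outcome of the first rule whose condition is true."""
    for condition, outcome in rule_set:
        if condition(sample):
            return outcome
    return default

rules = [
    (lambda s: s["age"] > 60 and s["bp"] > 140, "high risk"),
    (lambda s: s["bp"] > 160, "high risk"),
]

print(evaluate(rules, {"age": 70, "bp": 150}, "low risk"))  # high risk
print(evaluate(rules, {"age": 30, "bp": 120}, "low risk"))  # low risk
```

Because each rule is a human-readable logical expression, a practitioner could inspect the list directly, spot an unwanted bias (e.g. a rule conditioned on a sensitive attribute), and delete or edit that rule without retraining, which is the kind of direct intervention the abstract describes.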