Most AI systems are black boxes that generate reasonable outputs for given inputs. Some domains, however, have explainability and trustworthiness requirements that black-box models cannot meet directly. Various methods have therefore been developed to interpret black-box models after training. This paper advocates an alternative approach in which the models are transparent and explainable from the start. This approach, EVOTER, evolves rule sets based on simple logical expressions. It is evaluated in several prediction/classification and prescription/policy search domains, with and without a surrogate model. It is shown to discover meaningful rule sets that perform similarly to black-box models. The rules can provide insight into the domain and make biases hidden in the data explicit. It may also be possible to edit them directly to remove biases and add constraints. EVOTER thus forms a promising foundation for building trustworthy AI systems for real-world applications in the future.
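To make the idea of evolving rule sets built from simple logical expressions concrete, the following is a minimal illustrative sketch, not EVOTER's actual representation or search procedure. All names (`Condition`, `Rule`, `RuleSet`, `mutate`, `evolve`) and the toy data are assumptions introduced here. A rule is assumed to be a conjunction of threshold comparisons on input features, a rule set returns the action of the first rule that fires, and the search is a simple elitist mutation loop standing in for a full evolutionary algorithm.

```python
import operator
import random
from dataclasses import dataclass

# Comparison operators allowed in a condition (an assumed, minimal grammar).
OPS = {"<": operator.lt, ">=": operator.ge}


@dataclass(frozen=True)
class Condition:
    feature: int      # index of the input feature
    op: str           # key into OPS
    threshold: float

    def holds(self, x):
        return OPS[self.op](x[self.feature], self.threshold)


@dataclass(frozen=True)
class Rule:
    conditions: tuple  # conjunction of Condition objects
    action: int        # class label or action emitted when the rule fires

    def fires(self, x):
        return all(c.holds(x) for c in self.conditions)


@dataclass(frozen=True)
class RuleSet:
    rules: tuple
    default: int       # action used when no rule fires

    def predict(self, x):
        # First-match semantics: the earliest firing rule decides.
        for rule in self.rules:
            if rule.fires(x):
                return rule.action
        return self.default


def accuracy(rule_set, X, y):
    hits = sum(rule_set.predict(x) == label for x, label in zip(X, y))
    return hits / len(y)


def mutate(rule_set, rng, step=0.5):
    # Perturb the threshold of one randomly chosen condition.
    rules = list(rule_set.rules)
    i = rng.randrange(len(rules))
    conds = list(rules[i].conditions)
    j = rng.randrange(len(conds))
    c = conds[j]
    conds[j] = Condition(c.feature, c.op, c.threshold + rng.uniform(-step, step))
    rules[i] = Rule(tuple(conds), rules[i].action)
    return RuleSet(tuple(rules), rule_set.default)


def evolve(rule_set, X, y, generations=200, seed=0):
    # Elitist loop: keep a child only if it is at least as fit as its parent.
    rng = random.Random(seed)
    best, best_fit = rule_set, accuracy(rule_set, X, y)
    for _ in range(generations):
        child = mutate(best, rng)
        fit = accuracy(child, X, y)
        if fit >= best_fit:
            best, best_fit = child, fit
    return best, best_fit


# Toy data (hypothetical): the label is 1 when the first feature is large.
X = [(0.1, 0.9), (0.4, 0.2), (0.6, 0.8), (0.9, 0.3)]
y = [0, 0, 1, 1]

start = RuleSet((Rule((Condition(0, ">=", 0.95),), 1),), default=0)
start_fit = accuracy(start, X, y)
best, best_fit = evolve(start, X, y)
print(best)
print(best_fit)
```

Because the evolved object is a readable list of conditions, it can be inspected or edited by hand, for example by deleting a rule that encodes an unwanted bias or by adding one that enforces a constraint. That is the property the abstract emphasizes.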