Learning systems based on IF-THEN rule representations are readily interpretable, which makes them an important focus of contemporary AI research. A key objective for such rule sets is to achieve both high discriminative power and interpretability. Existing state-of-the-art algorithms implicitly prioritize predictive accuracy and often fall short on one or more of the quality metrics that underpin interpretability, such as the coverage and parsimony of the rule set. Motivated by this gap, this paper proposes CDPR, an approach for learning highly accurate and interpretable rule sets for classification problems. To the best of our knowledge, this is the first attempt to establish such an approach. We introduce two algorithms rooted in submodular maximization that provide provable guarantees on coverage and yield rule sets that are both discriminative and parsimonious. We empirically demonstrate that rule sets learned with our approaches achieve higher accuracy and interpretability, with more than a 2.5-fold improvement in average coverage rate over the next best algorithm.
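To make the link between submodular maximization and coverage concrete, the sketch below shows the classic greedy heuristic for maximum coverage. It is not the CDPR algorithm, which the abstract does not specify; it is only a generic illustration. The function names, the rule names (`r1` to `r4`), and the toy data are all hypothetical. Coverage is a monotone submodular set function, and under a cardinality constraint the greedy choice is known to reach at least a (1 - 1/e) fraction of the best achievable coverage, which is the kind of provable guarantee this family of methods can offer.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_max_coverage(rule_cover: Dict[str, FrozenSet[int]], k: int) -> List[str]:
    """Greedily pick up to k rules, each time taking the largest marginal coverage gain.

    rule_cover maps a (hypothetical) rule name to the indices of the examples it covers.
    """
    chosen: List[str] = []
    covered: Set[int] = set()
    for _ in range(k):
        best_rule, best_gain = None, 0
        # Iterate in sorted order so ties are broken deterministically.
        for name in sorted(rule_cover):
            if name in chosen:
                continue
            gain = len(rule_cover[name] - covered)  # newly covered examples
            if gain > best_gain:
                best_rule, best_gain = name, gain
        if best_rule is None:
            # No remaining rule adds coverage, so stop early (parsimony).
            break
        chosen.append(best_rule)
        covered |= rule_cover[best_rule]
    return chosen


def coverage_rate(rule_cover: Dict[str, FrozenSet[int]], chosen: List[str], n_examples: int) -> float:
    """Fraction of the n_examples covered by at least one chosen rule."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= rule_cover[name]
    return len(covered) / n_examples


# Toy data (assumption): 8 examples indexed 0..7 and four candidate rules.
toy_rules = {
    "r1": frozenset({0, 1, 2, 3}),
    "r2": frozenset({3, 4, 5}),
    "r3": frozenset({0, 1}),
    "r4": frozenset({6, 7}),
}

selected = greedy_max_coverage(toy_rules, k=2)
print(selected, coverage_rate(toy_rules, selected, n_examples=8))
```

The early stop when no rule adds coverage is a deliberate design choice: it keeps the selected set small, so a redundant rule such as `r3` is never added even if the budget `k` would allow it. A real rule learner would also need a term for discriminative power, which this sketch omits.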