Random Forests and related tree-based methods are popular for supervised learning from tabular data. Besides being easy to parallelize, they also deliver superior classification performance. However, these strengths, parallelizability in particular, come at the cost of explainability. Statistical methods are often used to compensate for this disadvantage, yet their capacity for local explanations, and especially for global explanations, is limited. In the present work we propose an algebraic method, rooted in lattice theory, for the (global) explanation of tree ensembles. In detail, we introduce two novel conceptual views on tree ensemble classifiers and demonstrate their explanatory capabilities on Random Forests that were trained with standard parameters.