Tree Ensemble (TE) models, such as Gradient Boosted Trees, often achieve state-of-the-art performance on tabular datasets, yet their lack of transparency makes their decision logic hard to understand. This paper introduces TE2Rules (Tree Ensemble to Rules), a novel approach for explaining binary classification tree ensemble models through a list of rules, with a particular focus on explaining the minority class. Many state-of-the-art explainers struggle with minority class explanations, which makes TE2Rules valuable in such cases. The rules generated by TE2Rules closely approximate the original model, so they offer a high-fidelity, interpretable account of how the model makes decisions. Experimental results demonstrate that TE2Rules scales effectively to tree ensembles with hundreds of trees, achieving higher fidelity than baselines at comparable runtimes. TE2Rules also allows a trade-off between runtime and fidelity, which enhances its practical applicability. The implementation is available at https://github.com/linkedin/TE2Rules.
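To make the notion of fidelity concrete, the following is a minimal sketch, assuming fidelity is measured as the fraction of instances on which the rule list and the tree ensemble predict the same class. It does not use the TE2Rules API: the helper names (`rule_list_predict`, `fidelity`), the toy rows, and the stand-in model predictions are all hypothetical and serve only as illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Rule = Callable[[Row], bool]


def rule_list_predict(rules: List[Rule], row: Row) -> int:
    """Predict 1 (the explained, e.g. minority, class) if any rule fires, else 0."""
    return int(any(rule(row) for rule in rules))


def fidelity(rules: List[Rule], rows: List[Row], model_preds: List[int]) -> float:
    """Fraction of rows where the rule list agrees with the model's prediction."""
    if not rows or len(rows) != len(model_preds):
        raise ValueError("rows and model_preds must be non-empty and equal in length")
    agree = sum(rule_list_predict(rules, r) == p for r, p in zip(rows, model_preds))
    return agree / len(rows)


# Hypothetical toy data: model_preds stands in for a tree ensemble's outputs.
rows = [
    {"age": 25, "income": 30.0},
    {"age": 52, "income": 90.0},
    {"age": 47, "income": 85.0},
    {"age": 33, "income": 95.0},
    {"age": 61, "income": 40.0},
]
model_preds = [0, 1, 1, 1, 0]

# One rule misses the fourth row; adding a second rule recovers it.
one_rule = [lambda r: r["age"] > 45 and r["income"] > 80.0]
two_rules = one_rule + [lambda r: r["income"] > 92.0]

print(fidelity(one_rule, rows, model_preds))
print(fidelity(two_rules, rows, model_preds))
```

In this toy setting, a longer rule list covers more of the model's positive predictions. This loosely mirrors the runtime versus fidelity trade-off mentioned in the abstract, since finding more rules generally takes more computation.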