Low-order functional ANOVA (fANOVA) models have been rediscovered in the machine learning (ML) community under the guise of inherently interpretable machine learning. Explainable Boosting Machines (EBM; Lou et al. 2013) and GAMI-Net (Yang et al. 2021) are two recently proposed ML algorithms for fitting functional main effects and second-order interactions. We propose a new algorithm, GAMI-Tree, that is similar to EBM but has several features that lead to better performance. It uses model-based trees as base learners and incorporates a new interaction filtering method that is better at capturing the underlying interactions. In addition, our iterative training method converges to a model with better predictive performance, and the embedded purification ensures that interactions are hierarchically orthogonal to main effects. The algorithm does not need extensive tuning, and our implementation is fast and efficient. We use simulated and real datasets to compare the performance and interpretability of GAMI-Tree with those of EBM and GAMI-Net.
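To make the purification idea concrete, the following is a minimal sketch of one common way to purify a fitted pairwise interaction on a binned grid. It is not the GAMI-Tree implementation, whose details are not given here. It assumes the interaction is stored as a 2D table over bins, that `weights` holds strictly positive joint bin weights (for example, empirical bin frequencies), and it alternately moves weighted row and column means out of the interaction into the two main effects. The function name `purify` and its parameters are hypothetical.

```python
import numpy as np


def purify(table, weights, tol=1e-10, max_iter=1000):
    """Split a binned interaction into a purified part plus two main effects.

    Assumes every entry of `weights` is strictly positive, so that all
    row and column weight totals are nonzero.
    """
    f = np.asarray(table, dtype=float).copy()
    w = np.asarray(weights, dtype=float)
    main_row = np.zeros(f.shape[0])  # mass moved to the first feature's main effect
    main_col = np.zeros(f.shape[1])  # mass moved to the second feature's main effect
    row_w = w.sum(axis=1)
    col_w = w.sum(axis=0)

    for _ in range(max_iter):
        # Remove the weighted mean of each row and credit it to main_row.
        row_mean = (w * f).sum(axis=1) / row_w
        f -= row_mean[:, None]
        main_row += row_mean

        # Remove the weighted mean of each column and credit it to main_col.
        col_mean = (w * f).sum(axis=0) / col_w
        f -= col_mean[None, :]
        main_col += col_mean

        # Stop once both corrections are negligible.
        if max(np.abs(row_mean).max(), np.abs(col_mean).max()) < tol:
            break

    return f, main_row, main_col


# Illustrative data only: a random 4 x 5 interaction table and bin weights.
rng = np.random.default_rng(0)
table = rng.normal(size=(4, 5))
weights = rng.uniform(0.5, 1.5, size=(4, 5))
pure, row_effect, col_effect = purify(table, weights)
```

After purification, the weighted sums of `pure` along each row and each column are approximately zero, which is the discrete analogue of the interaction being hierarchically orthogonal to the main effects, while `pure + row_effect[:, None] + col_effect[None, :]` still reproduces the original table, so predictions are unchanged.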