Low-order functional ANOVA (fANOVA) models have been rediscovered in the machine learning (ML) community under the guise of inherently interpretable machine learning. The Explainable Boosting Machine (EBM; Lou et al. 2013) and GAMI-Net (Yang et al. 2021) are two recently proposed ML algorithms for fitting functional main effects and second-order interactions. We propose a new algorithm, GAMI-Tree, that is similar to EBM but has several features that lead to better performance. It uses model-based trees as base learners and incorporates a new interaction filtering method that is better at capturing the underlying interactions. In addition, its iterative training method converges to a model with better predictive performance, and an embedded purification step ensures that the interactions are hierarchically orthogonal to the main effects. The algorithm does not need extensive tuning, and our implementation is fast and efficient. We use simulated and real datasets to compare the predictive performance and interpretability of GAMI-Tree with those of EBM and GAMI-Net.
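For orientation, the model class referred to above can be written out as follows. This is a sketch in the standard fANOVA notation, not notation taken from this abstract: the link function g, the intercept \mu, the component names f_j and f_{jk}, and the predictor count p are assumptions introduced here for illustration, and the orthogonality conditions are one common way of stating hierarchical orthogonality.

```latex
% Low-order fANOVA model: intercept, main effects, second-order interactions.
% g, \mu, f_j, f_{jk}, and p are illustrative symbols (assumed, not from the source).
g\bigl(\mathbb{E}[\,y \mid x\,]\bigr)
  = \mu + \sum_{j=1}^{p} f_j(x_j) + \sum_{1 \le j < k \le p} f_{jk}(x_j, x_k)

% Centering of each component under the joint distribution of the predictors:
\mathbb{E}\bigl[f_j(X_j)\bigr] = 0, \qquad
\mathbb{E}\bigl[f_{jk}(X_j, X_k)\bigr] = 0

% Hierarchical orthogonality: each interaction is orthogonal to its parent main effects.
\mathbb{E}\bigl[f_{jk}(X_j, X_k)\, f_j(X_j)\bigr] = 0, \qquad
\mathbb{E}\bigl[f_{jk}(X_j, X_k)\, f_k(X_k)\bigr] = 0
```

Under these conditions, any part of a fitted interaction that could be explained by a single predictor is attributed to the corresponding main effect, which is the role the abstract assigns to purification.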