For machine learning on tabular data, XGBoost and fully connected neural networks (FCNNs) are the two most popular methods because of their strong predictive performance and ease of use. However, they are highly complex, hard to interpret, and prone to overfitting. In this paper, we propose a new modeling framework called cross spline net (CSN), which combines a spline transformation with a cross-network (Wang et al. 2017, 2021). We show that CSN matches these methods in performance and ease of use while being less complex, more interpretable, and more robust. Moreover, the CSN framework is flexible: the spline layer can be configured in different ways to yield different models. With different choices of the spline layer, we can reproduce or approximate a set of non-neural-network models, including linear and spline-based statistical models, decision trees, RuleFit, tree ensembles (gradient boosting trees, random forests), oblique trees and forests, multivariate adaptive regression splines (MARS), and SVMs with polynomial kernels. CSN therefore provides a unified framework that places these non-neural-network models under a single neural network formulation. By using the scalable and powerful gradient descent algorithms available in neural network libraries, CSN avoids some pitfalls of the case-specific optimization methods used to fit those models, such as being ad hoc, greedy, or non-scalable. We use a special type of CSN, TreeNet, to illustrate this point, and we compare TreeNet with XGBoost and FCNN to show its benefits. We believe CSN offers practitioners a flexible and convenient framework for building performant, robust, and more interpretable models.
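The combination described above, a spline transformation followed by a cross-network, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the spline layer is assumed to be a linear (hinge) basis with knots at empirical quantiles, the cross layer follows the DCN-V2 form $x_{l+1} = x_0 \odot (W_l x_l + b_l) + x_l$ from the cited cross-network work, and the names `hinge_spline_features`, `cross_layer`, and `csn_forward` are hypothetical. Weights are random here; in practice they would be trained by gradient descent.

```python
import numpy as np

def hinge_spline_features(X, knots):
    """Expand each feature into linear-spline (hinge) bases max(0, x - knot).

    X: (n_samples, n_features); knots: (n_features, n_knots).
    Returns an array of shape (n_samples, n_features * n_knots).
    The hinge basis is an assumption made for this sketch.
    """
    hinge = np.maximum(0.0, X[:, :, None] - knots[None, :, :])
    return hinge.reshape(X.shape[0], -1)

def cross_layer(x0, xl, W, b):
    """One DCN-V2-style cross layer: x0 * (xl @ W + b) + xl (elementwise product)."""
    return x0 * (xl @ W + b) + xl

def csn_forward(X, knots, cross_params, w_out, b_out):
    """Spline layer, then stacked cross layers, then a linear output."""
    x0 = hinge_spline_features(X, knots)
    xl = x0
    for W, b in cross_params:
        xl = cross_layer(x0, xl, W, b)
    return xl @ w_out + b_out

# Toy usage with random (untrained) weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
knots = np.quantile(X, [0.25, 0.5, 0.75], axis=0).T  # (n_features, n_knots)
d = knots.size                                       # width after the spline layer
cross_params = [(rng.normal(scale=0.1, size=(d, d)), np.zeros(d)) for _ in range(2)]
w_out = rng.normal(size=d)
y = csn_forward(X, knots, cross_params, w_out, 0.0)  # one prediction per row of X
```

In this sketch, each cross layer multiplies the running representation elementwise by the spline features, so stacking layers raises the order of interactions among the spline bases by one per layer. Swapping the hinge basis for a different spline layer would change the model family while leaving the cross-network untouched.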