Cross Spline Net and a Unified World

For tabular data, XGBoost and the fully connected neural network (FCNN) are today the two most popular machine learning methods, owing to their strong model performance and ease of use. However, they are highly complicated, hard to interpret, and prone to overfitting. In this paper, we propose a new modeling framework called cross spline net (CSN), which combines a spline transformation with a cross-network (Wang et al. 2017, 2021). We show that CSN is as performant and as convenient to use as these methods, while being less complicated, more interpretable, and more robust. Moreover, the CSN framework is flexible: the spline layer can be configured differently to yield different models. With different choices of the spline layer, we can reproduce or approximate a set of non-neural network models, including linear and spline-based statistical models, trees, RuleFit, tree ensembles (gradient boosting trees, random forests), oblique trees/forests, multivariate adaptive regression splines (MARS), SVMs with a polynomial kernel, etc. CSN therefore provides a unified modeling framework that places these non-neural network models under the same neural network framework. By using the scalable and powerful gradient descent algorithms available in neural network libraries, CSN avoids some pitfalls (such as being ad hoc, greedy, or non-scalable) of the case-specific optimization methods used to fit those models. We use a special type of CSN, TreeNet, to illustrate this point, and we compare TreeNet with XGBoost and FCNN to show its benefits. We believe CSN will provide a flexible and convenient framework for practitioners to build performant, robust, and more interpretable models.
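The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of a spline layer feeding a cross-network. It assumes a linear hinge basis, max(0, x - knot), for the spline layer and the cross layer of Wang et al. (2017), x_{l+1} = x_0 (x_l . w) + b + x_l. The function names, the knot layout, and the final linear read-out `beta` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def hinge_spline_features(X, knots):
    """Assumed spline layer: one hinge feature max(0, x_j - k) per feature j and knot k.

    X has shape (n, d), knots has shape (d, m); the result has shape (n, d * m).
    """
    H = np.maximum(0.0, X[:, :, None] - knots[None, :, :])
    return H.reshape(X.shape[0], -1)

def cross_layer(x0, xl, w, b):
    """One cross layer in the style of Wang et al. (2017): x0 * (xl . w) + b + xl."""
    return x0 * (xl @ w)[:, None] + b + xl

def csn_forward(X, knots, weights, biases, beta):
    """Hypothetical forward pass: spline features, stacked cross layers, linear read-out."""
    x0 = hinge_spline_features(X, knots)
    xl = x0
    for w, b in zip(weights, biases):
        xl = cross_layer(x0, xl, w, b)
    return xl @ beta
```

Each cross layer raises the maximum interaction order among the spline features by one, so in this sketch the number of layers bounds how many features can interact in a single term.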

