Industries of all kinds are trying to leverage Artificial Intelligence (AI) on their existing big data, which is available in so-called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns, also known as features. Deep Learning (DL) has constituted a major breakthrough for AI in fields related to human skills, such as natural language processing, but applying it to tabular data has proven more challenging: more classical Machine Learning (ML) models, such as tree-based ensembles, usually perform better. This paper presents a novel DL model that uses a Graph Neural Network (GNN), more specifically an Interaction Network (IN), for contextual embedding and for modelling interactions among tabular features. Its results outperform those of a recently published survey's DL benchmark on five public datasets, and are competitive with boosted-tree solutions.