This survey examines Tabular Data Learning (TDL) with Graph Neural Networks (GNNs), a domain in which deep learning-based approaches have increasingly outperformed traditional methods on both classification and regression tasks. The survey highlights a critical gap in deep neural TDL methods: the underrepresentation of latent correlations among data instances and feature values. GNNs, which can model intricate relationships and interactions between diverse elements of tabular data, have attracted significant interest and been applied across various TDL domains. We provide a systematic review of the methods involved in designing and implementing GNNs for TDL (GNN4TDL), covering the foundational aspects and an overview of GNN-based TDL methods, and offering insight into their evolving landscape. We present a comprehensive taxonomy focused on graph structure construction and representation learning within GNN-based TDL methods. In addition, the survey examines various training plans, emphasizing the integration of auxiliary tasks to enhance the effectiveness of instance representations. We also discuss the practical application of GNNs across a spectrum of GNN4TDL scenarios, demonstrating their versatility and impact. Lastly, we discuss limitations and propose future research directions, aiming to spur advancements in GNN4TDL. This survey serves as a resource for researchers and practitioners, offering a thorough understanding of the role of GNNs in TDL and pointing towards future innovations in this area.