This study addresses the challenge of detecting semantic column types in relational tables, a key task in many real-world applications. While language models like BERT have improved prediction accuracy, their token input constraints limit the simultaneous processing of intra-table and inter-table information. We propose a novel approach using Graph Neural Networks (GNNs) to model intra-table dependencies, allowing language models to focus on inter-table information. Our proposed method not only outperforms existing state-of-the-art algorithms but also offers novel insights into the utility and functionality of various GNN types for semantic type detection. The code is available at https://github.com/hoseinzadeehsan/GAIT
翻译:本研究针对关系表格中语义列类型检测这一挑战展开,该任务在众多实际应用中具有关键作用。尽管BERT等语言模型提升了预测准确性,但其词元输入限制制约了表内与表间信息的同步处理。我们提出一种新型方法,通过图神经网络(GNNs)建模表内依赖关系,使语言模型得以专注于表间信息。所提方法不仅超越了现有最优算法,更揭示了各类图神经网络在语义类型检测中的实用价值与功能特性。相关代码已开源发布于 https://github.com/hoseinzadeehsan/GAIT