Representation learning on textual graphs aims to generate low-dimensional embeddings for nodes based on their individual textual features and neighbourhood information. Recent breakthroughs in pretrained language models and graph neural networks have pushed forward the development of the corresponding techniques. Existing works mainly rely on a cascaded model architecture: the textual features of nodes are first encoded independently by language models, and the resulting textual embeddings are then aggregated by graph neural networks. However, this architecture is limited by the independent modeling of textual features. In this work, we propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models. With the proposed architecture, text encoding and graph aggregation are fused into an iterative workflow, so that each node's semantics are accurately comprehended from a global perspective. In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information on the graph. Extensive evaluations are conducted on three large-scale benchmark datasets, where GraphFormers outperforms the SOTA baselines with comparable running efficiency.
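The contrast between the cascaded architecture and the nested one can be illustrated with a toy sketch. This is not the authors' implementation: `text_layer`, `graph_aggregate`, `cascaded`, and `nested` are hypothetical names, the "transformer block" is replaced by simple mean mixing, the GNN component is replaced by mean pooling over a node and its neighbours, and the first token is assumed to serve as the node-level summary. The sketch only shows where graph information enters the computation.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Tokens = List[Vector]          # token states of one node's text
Graph = Dict[int, List[int]]   # node id -> neighbour ids


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def text_layer(tokens: Tokens) -> Tokens:
    """Toy stand-in for one transformer block (assumption): every token
    is mixed with the mean of its own sequence."""
    ctx = mean(tokens)
    return [[0.5 * t + 0.5 * c for t, c in zip(tok, ctx)] for tok in tokens]


def graph_aggregate(node_vecs: Dict[int, Vector], graph: Graph) -> Dict[int, Vector]:
    """Toy stand-in for a GNN layer (assumption): mean over a node
    and its neighbours."""
    return {
        v: mean([node_vecs[v]] + [node_vecs[u] for u in graph[v]])
        for v in graph
    }


def cascaded(texts: Dict[int, Tokens], graph: Graph,
             num_layers: int = 2) -> Tuple[Dict[int, Tokens], Dict[int, Vector]]:
    """Encode every node's text independently, then aggregate once."""
    states = dict(texts)
    for _ in range(num_layers):
        states = {v: text_layer(toks) for v, toks in states.items()}
    node_vecs = {v: toks[0] for v, toks in states.items()}  # first token as summary
    return states, graph_aggregate(node_vecs, graph)


def nested(texts: Dict[int, Tokens], graph: Graph,
           num_layers: int = 2) -> Tuple[Dict[int, Tokens], Dict[int, Vector]]:
    """Interleave graph aggregation with every text layer."""
    states = dict(texts)
    for _ in range(num_layers):
        node_vecs = {v: toks[0] for v, toks in states.items()}
        graph_tokens = graph_aggregate(node_vecs, graph)
        # Prepend the graph-augmented token, run the text layer, then drop it.
        states = {
            v: text_layer([graph_tokens[v]] + toks)[1:]
            for v, toks in states.items()
        }
    return states, {v: toks[0] for v, toks in states.items()}


# Two linked nodes; only node 1's text differs between the two inputs.
graph = {0: [1], 1: [0]}
texts_a = {0: [[1.0, 0.0], [0.0, 1.0]], 1: [[2.0, 2.0], [0.0, 0.0]]}
texts_b = {0: texts_a[0], 1: [[-3.0, 5.0], [1.0, 1.0]]}

cas_a, _ = cascaded(texts_a, graph)
cas_b, _ = cascaded(texts_b, graph)
nes_a, _ = nested(texts_a, graph)
nes_b, _ = nested(texts_b, graph)

print(cas_a[0] == cas_b[0])  # cascaded: node 0's token states ignore its neighbour
print(nes_a[0] == nes_b[0])  # nested: node 0's token states depend on its neighbour
```

In the cascaded variant, a node's token states are fixed before any neighbour is seen, so neighbourhood information can only affect the final pooled embedding. In the nested variant, neighbour information is injected before every text layer, which is the sense in which text encoding and graph aggregation form an iterative workflow.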