Self-supervised representation learning on text-attributed graphs, which aims to create expressive and generalizable representations for various downstream tasks, has received increasing research attention. However, existing methods either struggle to capture the full extent of structural context information or rely on task-specific training labels, which largely hampers their effectiveness and generalizability in practice. To address these limitations, we develop a novel graph-centric language model, GRENADE. Specifically, GRENADE exploits the synergy between a pre-trained language model and a graph neural network by optimizing with two specialized self-supervised learning algorithms: graph-centric contrastive learning and graph-centric knowledge alignment. These algorithms help GRENADE capture both informative textual semantics and structural context information on text-attributed graphs. In extensive experiments, GRENADE outperforms state-of-the-art methods. The implementation is available at \url{https://github.com/bigheiniu/GRENADE}.