Knowledge graph completion (KGC) is the task of inferring missing triples from existing Knowledge Graphs (KGs). Both structural and semantic information are vital for successful KGC. However, existing methods use only either the structural knowledge from KG embeddings or the semantic information from pre-trained language models (PLMs), leading to suboptimal performance. Moreover, since PLMs are not trained on KGs, directly using them to encode triples may be inappropriate. To overcome these limitations, we propose a novel framework called Bridge, which jointly encodes the structural and semantic information of KGs. Specifically, we strategically encode entities and relations separately with PLMs to better exploit the semantic knowledge of PLMs and to enable structured representation learning via a structural learning principle. Furthermore, to bridge the gap between KGs and PLMs, we employ a self-supervised representation learning method called BYOL to fine-tune PLMs with two different views of a triple. Unlike BYOL, which uses augmentation to create two semantically similar views of the same image and may thereby alter the semantic information, we strategically split the triple into two parts to create the two views, thus avoiding semantic alteration. Experiments demonstrate that Bridge outperforms state-of-the-art (SOTA) models on three benchmark datasets.
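The BYOL-style objective described above can be sketched in miniature. This is a hedged illustration, not the paper's implementation: the random linear "encoders", the 4-dimensional embeddings, and the variable names (`W_online`, `W_target`, `tau`) are all placeholder assumptions standing in for the PLM encoders. It shows the two split views of a triple, the negative-cosine BYOL loss between the online and target branches, and the exponential moving-average (EMA) target update.

```python
import numpy as np

def l2_normalize(x):
    # Normalize a vector to unit length (guard against zero norm)
    n = np.linalg.norm(x)
    return x / n if n > 0 else x

def byol_loss(p, z):
    # BYOL loss: 2 - 2 * cosine(p, z); minimized when the online
    # prediction p aligns with the target projection z
    return 2.0 - 2.0 * float(np.dot(l2_normalize(p), l2_normalize(z)))

rng = np.random.default_rng(0)

# Toy stand-ins for the two encoder branches; in the paper these
# would be PLM-based encoders, not random linear maps
W_online = rng.normal(size=(4, 4))
W_target = W_online.copy()  # target branch starts as a copy of the online branch

def encode(W, x):
    return W @ x

# Two views of ONE triple, obtained by splitting rather than augmenting:
# view 1 = (head, relation) text embedding, view 2 = tail text embedding.
# Random vectors here stand in for PLM outputs.
view_hr = rng.normal(size=4)
view_t = rng.normal(size=4)

p = encode(W_online, view_hr)  # online branch (receives gradients in practice)
z = encode(W_target, view_t)   # target branch (stop-gradient in practice)
loss = byol_loss(p, z)

# EMA update of the target weights, as in BYOL (tau is the momentum)
tau = 0.99
W_target = tau * W_target + (1 - tau) * W_online
```

Because cosine similarity lies in [-1, 1], the loss is bounded in [0, 4]; training drives it toward 0, pulling the two views of the same triple together without any negative samples.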