The ability of knowledge graphs to represent complex relationships at scale has led to their adoption for various needs, including knowledge representation, question answering, and recommendation systems. Knowledge graphs are often incomplete in the information they represent, which motivates knowledge graph completion tasks. Pre-trained and fine-tuned language models have shown promise in these tasks, although these models ignore the intrinsic information encoded in the knowledge graph, namely the entity and relation types. In this work, we propose the Knowledge Graph Language Model (KGLM) architecture, in which we introduce a new entity/relation embedding layer that learns to differentiate distinct entity and relation types, thereby allowing the model to learn the structure of the knowledge graph. We show that further pre-training the language models with this additional embedding layer on triples extracted from the knowledge graph, followed by the standard fine-tuning phase, sets a new state of the art for the link prediction task on the benchmark datasets.
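To make the idea of an entity/relation embedding layer concrete, the following is a minimal sketch in plain Python, not the implementation behind KGLM. It assumes, by analogy with segment embeddings in BERT-style models, that each token of a triple is tagged with the type of the element it belongs to and that the type vector is added to the token vector. The three-way type scheme (`head`, `relation`, `tail`), the names `TYPE_IDS`, `type_ids_for_triple`, and `embed`, the toy vocabulary, the example triple, and the randomly initialized tables standing in for learned weights are all assumptions made for illustration.

```python
import random

# Hypothetical type ids; the abstract does not specify the actual scheme.
TYPE_IDS = {"head": 0, "relation": 1, "tail": 2}
DIM = 4  # toy embedding width, chosen only for readability


def make_table(num_rows, dim, rng):
    """Return a num_rows x dim table of small random values.

    This stands in for a learned embedding matrix.
    """
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def type_ids_for_triple(head_tokens, relation_tokens, tail_tokens):
    """Label every token with the type of the triple element it belongs to."""
    return (
        [TYPE_IDS["head"]] * len(head_tokens)
        + [TYPE_IDS["relation"]] * len(relation_tokens)
        + [TYPE_IDS["tail"]] * len(tail_tokens)
    )


def embed(token_ids, type_ids, token_table, type_table):
    """Sum token and type embeddings element-wise, giving one vector per token."""
    return [
        [t + s for t, s in zip(token_table[tok], type_table[typ])]
        for tok, typ in zip(token_ids, type_ids)
    ]


rng = random.Random(0)

# Illustrative triple and vocabulary, not taken from any benchmark dataset.
vocab = {"barack": 0, "obama": 1, "born": 2, "in": 3, "hawaii": 4}
head, rel, tail = ["barack", "obama"], ["born", "in"], ["hawaii"]

token_ids = [vocab[t] for t in head + rel + tail]
type_ids = type_ids_for_triple(head, rel, tail)

token_table = make_table(len(vocab), DIM, rng)
type_table = make_table(len(TYPE_IDS), DIM, rng)

vectors = embed(token_ids, type_ids, token_table, type_table)
print(type_ids)  # [0, 0, 1, 1, 2]
```

In this sketch the type table is the only new parameter set. Under the assumptions above, further pre-training on triples would update it together with the language model's existing weights, so that tokens carrying the same surface form can be represented differently depending on whether they appear in an entity or in a relation.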