The goal of knowledge graph completion (KGC) is to predict missing facts among entities. Previous KGC re-ranking methods are mostly built on non-generative language models, which are used to obtain the probability of each candidate. Recently, generative large language models (LLMs) have shown outstanding performance on tasks such as information extraction and dialog systems. Applying them to KGC re-ranking makes it possible to exploit their extensive pre-trained knowledge and powerful generative capabilities. However, doing so may introduce new problems, namely mismatch, misordering, and omission. To this end, we introduce KC-GenRe, a knowledge-constrained generative re-ranking method based on LLMs for KGC. To overcome the mismatch issue, we formulate the KGC re-ranking task as a candidate identifier sorting generation problem implemented by generative LLMs. To tackle the misordering issue, we develop a knowledge-guided interactive training method that enhances the identification and ranking of candidates. To address the omission issue, we design a knowledge-augmented constrained inference method that enables contextual prompting and controlled generation, so as to obtain valid rankings. Experimental results show that KC-GenRe achieves state-of-the-art performance on four datasets, with gains of up to 6.7% and 7.7% in the MRR and Hits@1 metrics compared to previous methods, and 9.0% and 11.1% compared to the results without re-ranking. Extensive analysis demonstrates the effectiveness of the components of KC-GenRe.
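The reported gains are stated in terms of MRR and Hits@1, the standard ranking metrics for KGC. As a minimal sketch of how these metrics are usually computed from the 1-based rank of the correct entity for each test query, the snippet below may help; the function names and the toy ranks are illustrative assumptions and are not taken from the paper or its code.

```python
from typing import Sequence


def rank_of_gold(ordered_candidates: Sequence[str], gold: str) -> int:
    """Return the 1-based position of the gold entity in a ranked candidate list.

    Hypothetical helper: assumes the gold entity is present in the list.
    """
    return list(ordered_candidates).index(gold) + 1


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """MRR: the average of 1/rank over all test queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Hits@k: the fraction of queries whose gold entity is ranked in the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy data only (not results from the paper): gold-entity ranks for three
# queries, before and after a hypothetical re-ranking step.
ranks_before = [2, 1, 4]
ranks_after = [1, 1, 2]

mrr_before = mean_reciprocal_rank(ranks_before)
mrr_after = mean_reciprocal_rank(ranks_after)
hits1_before = hits_at_k(ranks_before, 1)
hits1_after = hits_at_k(ranks_after, 1)
```

Under these definitions, a re-ranker improves both metrics whenever it moves gold entities toward the top of the candidate list, and Hits@1 changes only when a gold entity reaches the first position.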