The design and development of text-based knowledge graph completion (KGC) methods that leverage textual entity descriptions are at the forefront of current research. These methods employ advanced optimization techniques, such as soft prompts and contrastive learning, to enhance KGC models. The effectiveness of text-based methods largely hinges on the quality and richness of the training data. Large language models (LLMs) can use simple prompts to rewrite text data, enabling data augmentation for KGC. However, LLMs typically demand substantial computational resources. To address these issues, we introduce constrained prompts for KGC (CP-KGC), a framework that designs prompts adapted to different datasets to enhance semantic richness. In addition, CP-KGC employs a context-constraint strategy to effectively identify polysemous entities in KGC datasets. Extensive experiments verify the effectiveness of this framework: even after quantization, the LLM (Qwen-7B-Chat-int4) still improves the performance of text-based KGC methods\footnote{Code and datasets are available at \href{https://github.com/sjlmg/CP-KGC}{https://github.com/sjlmg/CP-KGC}.}. This study extends the performance limits of existing models and promotes further integration of KGC with LLMs.