In this work, we explore the use of Large Language Models (LLMs) for knowledge engineering tasks in the context of the ISWC 2023 LM-KBC Challenge. Given subject-relation pairs sourced from Wikidata, we use pre-trained LLMs to generate the relevant objects as strings and link them to their respective Wikidata QIDs. We developed a pipeline using LLMs for Knowledge Engineering (LLMKE), combining knowledge probing and Wikidata entity mapping. The method achieved a macro-averaged F1-score of 0.701 across the properties, with per-property scores ranging from 0.328 to 1.00. These results demonstrate that the knowledge of LLMs varies significantly depending on the domain and that further experimentation is required to determine the circumstances under which LLMs can be used for automatic knowledge base (e.g., Wikidata) completion and correction. Our analysis of the results also suggests that LLMs can make a promising contribution to collaborative knowledge engineering. LLMKE won Track 2 of the challenge. The implementation is available at https://github.com/bohuizhang/LLMKE.
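To make the reported metric concrete, the following is a minimal sketch of how a macro-averaged F1-score across properties could be computed. It assumes set-based precision and recall for each subject, averaged within each property and then across properties; the challenge's official scorer may differ in details such as how empty answers are handled. The function names, relation labels, and QIDs below are hypothetical placeholders, not taken from the LLMKE code or the challenge data.

```python
from collections import defaultdict


def f1_for_instance(predicted, gold):
    """Set-based F1 for one (subject, relation) pair."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        # Assumption: correctly predicting "no object" counts as a perfect score.
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def macro_f1(rows):
    """Average F1 within each relation, then average across relations."""
    scores_by_relation = defaultdict(list)
    for relation, predicted, gold in rows:
        scores_by_relation[relation].append(f1_for_instance(predicted, gold))
    per_relation = {
        relation: sum(scores) / len(scores)
        for relation, scores in scores_by_relation.items()
    }
    overall = sum(per_relation.values()) / len(per_relation)
    return overall, per_relation


# Hypothetical rows: (relation, predicted QIDs, gold QIDs).
rows = [
    ("RelationA", ["Q1", "Q2"], ["Q1", "Q2"]),
    ("RelationA", ["Q3"], ["Q3", "Q4"]),
    ("RelationB", [], []),
    ("RelationB", ["Q5"], ["Q6"]),
]
overall, per_relation = macro_f1(rows)
print(overall, per_relation)
```

Because every property contributes equally to the final average regardless of how many subjects it covers, a single weak property lowers the overall score as much as a strong one raises it, which is why the per-property range is reported alongside the average.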