Recent work has shown that Large Language Models (LLMs) can solve tasks related to Knowledge Graphs, such as Knowledge Graph Completion, even in Zero- or Few-Shot paradigms. However, they are known to hallucinate answers or to produce output non-deterministically, which leads to wrongly reasoned responses even when those responses satisfy the user's demands. To highlight opportunities and challenges in knowledge graph-related tasks, we experiment with two prominent LLMs, namely Mixtral-8x7B-Instruct-v0.1 and gpt-3.5-turbo-0125, on Knowledge Graph Completion for static knowledge graphs, using prompts constructed following the TELeR taxonomy, in Zero- and One-Shot contexts, on a Task-Oriented Dialogue system use case. When evaluated with both strict and flexible metrics, our results show that LLMs can be suitable for such a task if prompts encapsulate sufficient information and relevant examples.