Large language models (LLMs) have demonstrated impressive language understanding and generation capabilities, enabling them to answer a wide range of questions across many domains. However, these models are not flawless and often produce responses that contain errors or misinformation. These inaccuracies, commonly referred to as hallucinations, render LLMs unreliable and even unusable in many scenarios. In this paper, we focus on mitigating hallucination in LLMs, particularly in the context of question answering. Instead of attempting to answer every question, we explore a refusal mechanism that instructs LLMs to decline challenging questions in order to avoid errors. We then propose a simple yet effective solution called Learn to Refuse (L2R), which incorporates the refusal mechanism so that LLMs can recognize and refuse questions they find difficult to address. To achieve this, we use a structured knowledge base to represent all of the LLM's understanding of the world, which allows it to provide traceable gold knowledge. This knowledge base is separate from the LLM and initially empty, and it is progressively expanded with validated knowledge. When the LLM encounters a question outside its domain, the system identifies the scope of its knowledge and determines whether it can answer the question independently. Additionally, we introduce a method for automatically and efficiently expanding the knowledge base of LLMs. Through qualitative and quantitative analysis, we demonstrate that our approach enhances the controllability and reliability of LLMs.
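To make the refusal mechanism concrete, the following is a minimal sketch in Python rather than the L2R implementation itself. It assumes a knowledge base that starts empty and accepts only validated facts, and it refuses any question whose best-matching fact scores below a threshold. The names `KnowledgeBase`, `answer_or_refuse`, and `REFUSAL`, the token-overlap (Jaccard) similarity, and the threshold value of 0.3 are all illustrative assumptions; the abstract does not specify how relevance is measured or how the answer is generated.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical refusal message; the wording is an assumption.
REFUSAL = "I refuse to answer: no supporting knowledge found."


def _tokens(text: str) -> set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    stripped = (word.strip(".,?!").lower() for word in text.split())
    return {word for word in stripped if word}


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for a real retriever."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class KnowledgeBase:
    """Structured store kept separate from the model; starts empty."""

    facts: list[str] = field(default_factory=list)

    def add(self, fact: str, validated: bool) -> None:
        # Only validated knowledge is allowed to expand the base.
        if validated:
            self.facts.append(fact)

    def best_match(self, question: str) -> tuple[float, str | None]:
        """Return (score, fact) for the most similar fact, or (0.0, None)."""
        best_score, best_fact = 0.0, None
        for fact in self.facts:
            score = jaccard(question, fact)
            if score > best_score:
                best_score, best_fact = score, fact
        return best_score, best_fact


def answer_or_refuse(kb: KnowledgeBase, question: str, threshold: float = 0.3) -> str:
    """Answer only when a sufficiently relevant fact exists; otherwise refuse."""
    score, fact = kb.best_match(question)
    if fact is None or score < threshold:
        return REFUSAL
    # A real system would pass the fact to the LLM; here we just cite it,
    # which keeps the answer traceable to its supporting knowledge.
    return f"Based on: {fact}"
```

With an empty knowledge base every question is refused. After calling `kb.add("Paris is the capital of France", validated=True)`, the question "What is the capital of France?" is answered with a citation of that fact, while an unrelated question is still refused.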