Hallucinations pose a significant challenge for the practical deployment of large language models (LLMs). Relying on parametric knowledge to generate factual content is constrained by the limited knowledge stored within LLMs, potentially resulting in internal hallucinations. While incorporating external information can help fill knowledge gaps, it also introduces the risk of irrelevant content, thereby increasing the likelihood of external hallucinations. A careful, balanced integration of the parametric knowledge within LLMs and external information is therefore crucial for alleviating hallucinations. In this study, we present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinated outputs. This process is governed by a multilingual semantic-aware detection module, which evaluates the consistency of perturbed responses to the same queries across various languages. Upon detecting inconsistencies indicative of hallucinations, Rowen activates the retrieval of external information to rectify the model outputs. Rowen adeptly harmonizes the intrinsic parameters of LLMs with external knowledge sources, effectively mitigating hallucinations by ensuring a balanced integration of internal reasoning and external evidence. Through a comprehensive empirical analysis, we demonstrate that Rowen surpasses the current state of the art in both detecting and mitigating hallucinated content in the outputs of LLMs.
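The selective retrieval gating described above can be sketched as follows. This is a minimal illustration, not the actual Rowen implementation: it assumes the perturbed cross-lingual responses have already been mapped into one language, and it substitutes a crude lexical Jaccard overlap (a hypothetical `token_jaccard` helper) for the semantic-aware consistency scoring used in the paper.

```python
# Hedged sketch of consistency-gated retrieval, assuming answers to
# translated variants of a query are already back-translated into English.
from itertools import combinations


def token_jaccard(a: str, b: str) -> float:
    """Crude lexical proxy for the semantic similarity of two answers."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def should_retrieve(perturbed_answers: list, threshold: float = 0.5) -> bool:
    """Trigger external retrieval when answers to semantically equivalent
    (e.g., translated) queries disagree, signalling a likely hallucination;
    otherwise rely on the model's parametric knowledge."""
    if len(perturbed_answers) < 2:
        return False
    scores = [token_jaccard(a, b) for a, b in combinations(perturbed_answers, 2)]
    mean_consistency = sum(scores) / len(scores)
    return mean_consistency < threshold


# Consistent answers -> keep the parametric answer; inconsistent -> retrieve.
print(should_retrieve(["paris is the capital of france",
                       "paris is the capital of france"]))   # False
print(should_retrieve(["paris is the capital",
                       "the capital city is lyon actually"]))  # True
```

The design choice this illustrates is the core trade-off in the abstract: retrieval is activated only when internal responses disagree, so external evidence is consulted selectively rather than on every query.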