While Large Language Models (LLMs) perform exceptionally well on a variety of tasks after alignment, they may still confidently produce responses that contradict the context or world knowledge, a phenomenon known as ``hallucination''. In this paper, we demonstrate that reducing the inconsistency between the external knowledge encapsulated in the training data and the intrinsic knowledge inherited from the pretraining corpus can mitigate hallucination in alignment. Specifically, we introduce a novel knowledge consistent alignment (KCA) approach, which automatically formulates examinations based on external knowledge to assess the comprehension of LLMs. For data that exhibit knowledge inconsistency, KCA applies several simple yet efficient processing strategies. We demonstrate the superior performance of the proposed KCA approach in mitigating hallucinations across six benchmarks, using LLMs of different backbones and scales. Furthermore, we confirm the correlation between knowledge inconsistency and hallucination, which indicates that reducing knowledge inconsistency is effective in alleviating hallucinations. Our code, model weights, and data are publicly available at \url{https://github.com/fanqiwan/KCA}.
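The detection step described above, in which an examination built from external knowledge is used to decide whether a training example is consistent with what the model already knows, can be illustrated with a minimal sketch. This is not the released KCA implementation: the names `ExamQuestion`, `TrainingExample`, `exam_accuracy`, and `split_by_consistency`, the multiple-choice exam format, and the accuracy threshold of 0.5 are all assumptions made for illustration, and `answer_fn` stands in for a call to the LLM under test.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ExamQuestion:
    """One exam question derived from an example's external knowledge (hypothetical format)."""
    prompt: str
    answer: str  # label of the correct option, e.g. "A"


@dataclass
class TrainingExample:
    """An alignment example paired with the exam generated for it (hypothetical format)."""
    instruction: str
    exam: List[ExamQuestion]


def exam_accuracy(exam: List[ExamQuestion], answer_fn: Callable[[str], str]) -> float:
    """Fraction of exam questions that the model (answer_fn) answers correctly."""
    if not exam:
        raise ValueError("exam must contain at least one question")
    correct = sum(1 for q in exam if answer_fn(q.prompt).strip() == q.answer)
    return correct / len(exam)


def split_by_consistency(
    examples: List[TrainingExample],
    answer_fn: Callable[[str], str],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> Tuple[List[TrainingExample], List[TrainingExample]]:
    """Partition examples into (consistent, inconsistent) by exam accuracy."""
    consistent: List[TrainingExample] = []
    inconsistent: List[TrainingExample] = []
    for example in examples:
        if exam_accuracy(example.exam, answer_fn) >= threshold:
            consistent.append(example)
        else:
            inconsistent.append(example)
    return consistent, inconsistent
```

Under these assumptions, the consistent subset would be used for alignment unchanged, while the inconsistent subset would be handed to whichever processing strategy is chosen. The strategies themselves are not sketched here, since the abstract does not specify them.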