Although large language models (LLMs) perform exceptionally well on a variety of tasks after alignment, they may still confidently produce responses that contradict the context or world knowledge, a phenomenon known as ``hallucination''. In this paper, we demonstrate that reducing the inconsistency between the external knowledge encapsulated in the training data and the intrinsic knowledge inherited from the pretraining corpus can mitigate hallucination in alignment. Specifically, we introduce a novel knowledge consistent alignment (KCA) approach, which automatically formulates examinations based on external knowledge to assess the comprehension of LLMs. For data exhibiting knowledge inconsistency, KCA applies several simple yet efficient processing strategies. We demonstrate the superior performance of the proposed KCA approach in mitigating hallucinations across six benchmarks, using LLMs of different backbones and scales. Furthermore, we confirm the correlation between knowledge inconsistency and hallucination, which indicates that reducing knowledge inconsistency is effective in alleviating hallucinations. Our code, model weights, and data are publicly available at \url{https://github.com/fanqiwan/KCA}.
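The detect-then-process idea summarized in the abstract can be illustrated with a minimal sketch. It is not the released KCA implementation. All names here (`Example`, `exam_accuracy`, `process`, `REFUSAL`), the exact-match grading, the 0.5 threshold, and the two strategies labeled `"discard"` and `"refuse"` are hypothetical placeholders chosen for illustration. The abstract only states that examinations are built from external knowledge and that inconsistent data is handled by several simple strategies.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical refusal text; the abstract does not specify any such string.
REFUSAL = "I am not sure about this."


@dataclass
class Example:
    instruction: str
    response: str
    # (question, gold answer) pairs assumed to be derived from external knowledge
    exam: List[Tuple[str, str]]


def exam_accuracy(example: Example, answer_fn: Callable[[str], str]) -> float:
    """Fraction of exam questions answered correctly (exact match, an assumption)."""
    if not example.exam:
        return 1.0  # no exam questions: treat the example as consistent
    correct = sum(
        answer_fn(question).strip().lower() == gold.strip().lower()
        for question, gold in example.exam
    )
    return correct / len(example.exam)


def process(
    data: List[Example],
    answer_fn: Callable[[str], str],
    strategy: str = "discard",
    threshold: float = 0.5,  # hypothetical cut-off for "knowledge consistent"
) -> List[Example]:
    """Split data by exam accuracy, then apply one illustrative strategy."""
    consistent: List[Example] = []
    inconsistent: List[Example] = []
    for example in data:
        if exam_accuracy(example, answer_fn) >= threshold:
            consistent.append(example)
        else:
            inconsistent.append(example)

    if strategy == "discard":
        # Drop examples whose required knowledge the model appears to lack.
        return consistent
    if strategy == "refuse":
        # Keep the instruction but replace the response with a refusal.
        refusals = [Example(e.instruction, REFUSAL, e.exam) for e in inconsistent]
        return consistent + refusals
    raise ValueError(f"unknown strategy: {strategy}")
```

Here `answer_fn` stands in for querying the base model on one exam question. In practice it would wrap a model call, and the grading rule would likely need to be more robust than exact string matching.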