In this paper, we address the hallucination problem commonly found in natural language generation tasks. Language models often generate fluent and convincing content but can lack consistency with the provided source, resulting in potential inaccuracies. We propose a new decoding method called Fidelity-Enriched Contrastive Search (FECS), which augments the contrastive search framework with context-aware regularization terms. FECS promotes tokens that are semantically similar to the provided source while penalizing repetitiveness in the generated text. We demonstrate its effectiveness across two tasks prone to hallucination: abstractive summarization and dialogue generation. Results show that FECS consistently enhances faithfulness across various language model sizes while maintaining output diversity comparable to well-performing decoding algorithms.
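To make the scoring idea concrete, the following is a minimal sketch of how a candidate token could be ranked under a FECS-style objective. The abstract does not give the exact formula, so the form below is an assumption modeled on the standard contrastive search score (model confidence minus a degeneration penalty), with an added reward for similarity to the source. The names `fecs_style_score` and `pick_token`, the weights `alpha` and `beta`, and the toy two-dimensional hidden states are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def max_similarity(h, others):
    """Largest cosine similarity between h and any vector in others (0.0 if others is empty)."""
    return max((cosine(h, o) for o in others), default=0.0)


def fecs_style_score(prob, h_cand, source_states, generated_states,
                     alpha=0.3, beta=0.3):
    """Assumed scoring form, not taken verbatim from the paper:
    confidence - repetition penalty + source-similarity reward."""
    confidence = (1.0 - alpha - beta) * prob
    repetition_penalty = alpha * max_similarity(h_cand, generated_states)
    faithfulness_reward = beta * max_similarity(h_cand, source_states)
    return confidence - repetition_penalty + faithfulness_reward


def pick_token(candidates, source_states, generated_states, alpha=0.3, beta=0.3):
    """candidates: list of (token, probability, hidden_state) tuples,
    e.g. the top-k tokens proposed by the language model."""
    best = max(
        candidates,
        key=lambda c: fecs_style_score(
            c[1], c[2], source_states, generated_states, alpha, beta
        ),
    )
    return best[0]


# Toy example: one source state, one previously generated state.
source_states = [[1.0, 0.0]]
generated_states = [[0.0, 1.0]]
candidates = [
    ("on_topic", 0.4, [1.0, 0.0]),  # lower probability, but aligned with the source
    ("repeat", 0.6, [0.0, 1.0]),    # higher probability, but mirrors earlier output
]
print(pick_token(candidates, source_states, generated_states))
```

In this toy setup the lower-probability candidate wins because it matches the source representation and does not resemble the already generated text. Setting `alpha=0` and `beta=0` reduces the score to the model probability alone, in which case the higher-probability candidate is chosen instead.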