Natural Language Generation (NLG) has advanced rapidly in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has produced more fluent and coherent generation, which in turn has improved downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep learning based generation is prone to hallucinating unintended text, which degrades system performance and fails to meet user expectations in many real-world scenarios. Many studies have addressed this issue by measuring and mitigating hallucinated text, but these had not previously been reviewed in a comprehensive manner. In this survey, we therefore provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into three parts: (1) a general overview of metrics, mitigation methods, and future directions; (2) an overview of task-specific research progress on hallucinations in the following downstream tasks: abstractive summarization, dialogue generation, generative question answering, data-to-text generation, machine translation, and visual-language generation; and (3) hallucinations in large language models (LLMs). This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.