Large Language Models (LLMs) have demonstrated remarkable human-level natural language generation capabilities. However, their potential to generate misinformation, often called the hallucination problem, poses a significant risk to their deployment. A common approach to address this issue is to retrieve relevant knowledge and fine-tune the LLM with that knowledge included in its input. Unfortunately, this method incurs high training costs and may cause catastrophic forgetting in multi-task models. To overcome these limitations, we propose a knowledge-constrained decoding method called KCTS (Knowledge-Constrained Tree Search), which guides a frozen LM to generate text aligned with the reference knowledge at each decoding step using a knowledge classifier score and MCTS (Monte-Carlo Tree Search). To adapt the sequence-level knowledge classifier to token-level guidance, we also propose a novel token-level hallucination detection method called RIPA (Reward Inflection Point Approximation). Our empirical results on knowledge-grounded dialogue and abstractive summarization demonstrate the strength of KCTS as a plug-and-play, model-agnostic decoding method that can effectively reduce hallucinations in natural language generation.
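To make the idea of classifier-guided decoding with a frozen LM concrete, the following is a minimal sketch and not the KCTS algorithm itself. It replaces the Monte-Carlo Tree Search with a much simpler one-step re-ranking of the top-k next-token candidates, and it does not implement RIPA. The names `TOY_LM`, `toy_lm`, `knowledge_score`, and `guided_decode`, the token-overlap "classifier", and the `weight` parameter are all hypothetical, introduced only for illustration.

```python
import math

EOS = "</s>"

# Hypothetical frozen "LM": a fixed table of next-token probabilities keyed by
# the last generated token (None at the start). It slightly prefers the
# unsupported continuation "london" over the supported one "paris".
TOY_LM = {
    None: {"tower": 1.0},
    "tower": {"is": 1.0},
    "is": {"in": 1.0},
    "in": {"london": 0.5, "paris": 0.4, "rome": 0.1},
    "london": {EOS: 1.0},
    "paris": {EOS: 1.0},
    "rome": {EOS: 1.0},
}


def toy_lm(prefix):
    """Return next-token probabilities for a prefix (stand-in for a frozen LM)."""
    last = prefix[-1] if prefix else None
    return TOY_LM[last]


def knowledge_score(tokens, knowledge_tokens):
    """Toy stand-in for a knowledge classifier.

    Returns a high probability if every generated token appears in the
    reference knowledge, and a low probability otherwise (an assumption made
    for this example; a real classifier would be a trained model).
    """
    content = [t for t in tokens if t != EOS]
    return 0.95 if all(t in knowledge_tokens for t in content) else 0.05


def guided_decode(lm, knowledge, max_len=8, top_k=3, weight=1.0):
    """Greedy decoding where each candidate token is re-ranked by
    log p_LM(token) + weight * log p_classifier(prefix + token)."""
    knowledge_tokens = set(knowledge.split())
    out = []
    for _ in range(max_len):
        probs = lm(out)
        # Keep only the top-k candidates proposed by the frozen LM.
        candidates = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        best_token, _ = max(
            candidates,
            key=lambda kv: math.log(kv[1])
            + weight * math.log(knowledge_score(out + [kv[0]], knowledge_tokens)),
        )
        if best_token == EOS:
            break
        out.append(best_token)
    return out


KNOWLEDGE = "the eiffel tower is in paris"
print(guided_decode(toy_lm, KNOWLEDGE, weight=0.0))  # ['tower', 'is', 'in', 'london']
print(guided_decode(toy_lm, KNOWLEDGE, weight=1.0))  # ['tower', 'is', 'in', 'paris']
```

With `weight=0.0` the decoder follows the LM alone and produces the unsupported token; with `weight=1.0` the classifier score steers it toward the continuation supported by the reference knowledge, without changing the LM's parameters. A tree search would look further ahead than the single step used here.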