Grounded generation aims to equip language models (LMs) with the ability to produce more credible and accountable responses by accurately citing verifiable sources. However, existing methods, which feed LMs either raw or preprocessed materials, remain prone to errors. To address this, we introduce CaLM, a novel verification framework. CaLM builds on the insight that a robust grounded response should be consistent with information derived solely from its cited sources. Our framework uses smaller LMs, which rely less on parametric memory and excel at processing relevant information given a query, to validate the output of larger LMs. A larger LM's response is verified when it closely aligns with the smaller LM's output, which is generated exclusively from the cited documents. Responses that show discrepancies are iteratively refined through a feedback loop. Experiments on three open-domain question-answering datasets demonstrate significant average absolute performance gains of 1.5% to 7% without requiring any model fine-tuning.
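The verify-and-refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how agreement between the two answers is measured or how feedback is phrased, so the token-overlap F1 score, the `threshold` and `max_rounds` parameters, the callable signatures, and the toy models are all hypothetical stand-ins.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
# large LM: (query, all documents, feedback) -> (answer, ids of cited documents)
LargeLM = Callable[[str, Dict[str, str], str], Tuple[str, List[str]]]
# small LM: (query, texts of the cited documents only) -> answer
SmallLM = Callable[[str, List[str]], str]


def token_f1(a: str, b: str) -> float:
    """Token-overlap F1, used here only as a stand-in agreement score."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(ta), overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def verify_and_refine(
    query: str,
    docs: Dict[str, str],
    large_lm: LargeLM,
    small_lm: SmallLM,
    threshold: float = 0.6,  # assumed agreement cutoff
    max_rounds: int = 3,     # assumed cap on refinement rounds
) -> Tuple[str, bool]:
    """Return (answer, verified) after at most max_rounds attempts."""
    feedback = ""
    answer = ""
    for _ in range(max_rounds):
        # The large LM answers and names the documents it cites.
        answer, cited = large_lm(query, docs, feedback)
        # The small LM sees only the cited documents.
        cited_texts = [docs[doc_id] for doc_id in cited if doc_id in docs]
        check = small_lm(query, cited_texts)
        if token_f1(answer, check) >= threshold:
            return answer, True
        # Disagreement: pass feedback to the large LM and try again.
        feedback = (
            f"Previous answer {answer!r} disagreed with an answer derived "
            f"only from its citations ({check!r}); revise it."
        )
    return answer, False


# Toy stand-ins for the two models, for illustration only.
docs = {
    "d1": "Paris is the capital of France.",
    "d2": "Berlin is the capital of Germany.",
}


def toy_large(query: str, docs: Dict[str, str], feedback: str) -> Tuple[str, List[str]]:
    # First attempt is wrong and poorly cited; it corrects itself on feedback.
    return ("Paris", ["d1"]) if feedback else ("Lyon", ["d2"])


def toy_small(query: str, cited_texts: List[str]) -> str:
    # Answers strictly from the cited texts.
    return "Paris" if any("Paris" in text for text in cited_texts) else "unknown"


result = verify_and_refine("What is the capital of France?", docs, toy_large, toy_small)
print(result)  # ('Paris', True)
```

In this toy run the first response ("Lyon", citing `d2`) fails the check because the small model cannot derive it from the cited document, and the second response passes. A real system would replace the toy callables with actual model calls and would likely use a stronger agreement measure than token overlap.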