Recent Language Models (LMs) have shown impressive capabilities in generating text from the knowledge internalized in their parameters. Yet LMs often generate factually incorrect responses to given queries, since their knowledge may be inaccurate, incomplete, or outdated. To address this problem, previous works propose augmenting LMs with knowledge retrieved from an external knowledge source. However, such approaches often show suboptimal text generation performance for two reasons: 1) the model may fail to retrieve knowledge relevant to the given query, or 2) the model may not faithfully reflect the retrieved knowledge in the generated text. To overcome these issues, we propose verifying the output and the knowledge of knowledge-augmented LMs with a separate verifier, a small LM trained through instruction-finetuning to detect these two types of errors. When the verifier recognizes an error, we rectify it by either retrieving new knowledge or generating new text. Further, we ensemble the outputs that a single verifier produces under different instructions to enhance the reliability of the verification process. We validate the effectiveness of the proposed verification steps on multiple question answering benchmarks; the results show that the proposed verifier effectively identifies retrieval and generation errors, allowing LMs to provide more factually correct outputs. Our code is available at https://github.com/JinheonBaek/KALMV.
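The verify-then-rectify procedure described in the abstract can be sketched as a short loop. This is a minimal illustration rather than the code in the linked repository: every name here (`answer_with_verification`, `ensemble_verify`, the `toy_*` stand-ins, the label strings, `max_rounds`) is hypothetical, and majority voting is only one assumed way to combine a single verifier's outputs across instructions, since the abstract does not specify how the ensemble is formed.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Verdict labels (illustrative names, not taken from the paper's code).
RETRIEVAL_ERROR = "retrieval_error"
GENERATION_ERROR = "generation_error"
CORRECT = "correct"

Retriever = Callable[[str, int], str]           # (query, attempt) -> knowledge
Generator = Callable[[str, str, int], str]      # (query, knowledge, attempt) -> answer
Verifier = Callable[[str, str, str, str], str]  # (instruction, query, knowledge, answer) -> label


def ensemble_verify(verifier: Verifier, instructions: Sequence[str],
                    query: str, knowledge: str, answer: str) -> str:
    """Query one verifier under several instructions and take a majority vote.

    Majority voting is an assumption made for this sketch.
    """
    votes = Counter(verifier(inst, query, knowledge, answer) for inst in instructions)
    return votes.most_common(1)[0][0]


def answer_with_verification(query: str, retrieve: Retriever, generate: Generator,
                             verifier: Verifier, instructions: Sequence[str],
                             max_rounds: int = 3) -> Tuple[str, bool]:
    """Return (answer, verified), rectifying up to max_rounds times."""
    retrieval_attempt, generation_attempt = 0, 0
    knowledge = retrieve(query, retrieval_attempt)
    answer = generate(query, knowledge, generation_attempt)
    for round_idx in range(max_rounds + 1):
        verdict = ensemble_verify(verifier, instructions, query, knowledge, answer)
        if verdict == CORRECT:
            return answer, True
        if round_idx == max_rounds:
            break  # rectification budget exhausted
        if verdict == RETRIEVAL_ERROR:
            # Irrelevant knowledge: retrieve again, then regenerate from scratch.
            retrieval_attempt += 1
            generation_attempt = 0
            knowledge = retrieve(query, retrieval_attempt)
        else:
            # Knowledge looks relevant but the answer ignores it: regenerate only.
            generation_attempt += 1
        answer = generate(query, knowledge, generation_attempt)
    return answer, False


# Toy stand-ins so the sketch runs without any model (purely illustrative).
def toy_retrieve(query: str, attempt: int) -> str:
    passages = ["Lyon is a city in France.", "Paris is the capital of France."]
    return passages[min(attempt, len(passages) - 1)]


def toy_generate(query: str, knowledge: str, attempt: int) -> str:
    # The first attempt ignores the knowledge; later attempts copy its first word.
    return "Marseille" if attempt == 0 else knowledge.split()[0]


def toy_verify(instruction: str, query: str, knowledge: str, answer: str) -> str:
    if "capital" not in knowledge:
        return RETRIEVAL_ERROR
    if answer not in knowledge:
        return GENERATION_ERROR
    return CORRECT


INSTRUCTIONS = ["Is the knowledge relevant?", "Is the answer grounded?", "Check both."]

if __name__ == "__main__":
    print(answer_with_verification("What is the capital of France?",
                                   toy_retrieve, toy_generate, toy_verify, INSTRUCTIONS))
```

In this sketch a retrieval error triggers a fresh retrieval followed by regeneration, while a generation error keeps the knowledge and only regenerates, mirroring the two rectification paths named in the abstract. The `max_rounds` cap is an added assumption to guarantee termination.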