While language models (LMs) can sometimes generate factually correct text and estimate truth values of individual claims, these capabilities generally do not reflect a globally coherent, manipulable model of the world. As a consequence, current LMs also generate incorrect or nonsensical content, and are difficult to edit and bring up to date. We present a method called Deductive Closure Training (DCT) that uses LMs themselves to identify implications of (and contradictions within) the text that they generate, yielding an efficient self-supervised procedure for improving LM factuality. Given a collection of seed documents, DCT prompts LMs to generate additional text implied by these documents, reason globally about the correctness of this generated text, and finally fine-tune on text inferred to be correct. Given seed documents from a trusted source, DCT provides a tool for supervised model updating; if seed documents are sampled from the LM itself, DCT enables fully unsupervised fine-tuning for improved coherence and accuracy. Across the CREAK, MQUaKE, and Reversal Curse datasets, supervised DCT improves LM fact verification and text generation accuracy by 3-26%; on CREAK, fully unsupervised DCT improves verification accuracy by 12%. These results show that the reasoning capabilities LMs exhibit at inference time can be leveraged during training to improve their reliability.
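The three steps named in the abstract (generate related text, reason globally about its correctness, keep what is inferred to be correct) can be illustrated with a small sketch. The abstract does not specify how the global reasoning step is carried out, so everything below is an assumption made for illustration: the "implies"/"contradicts" relation labels, the brute-force search for the most probable logically consistent truth assignment, and the names `generate`, `score`, `most_probable_consistent`, and `build_finetuning_set` are all hypothetical. The LM is replaced by two caller-supplied functions, and the final fine-tuning step is left out.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical relation labels between the seed and a generated statement.
IMPLIES = "implies"
CONTRADICTS = "contradicts"

Relation = Tuple[int, str, int]  # (index_a, kind, index_b)


def is_consistent(assignment: Tuple[bool, ...], relations: List[Relation]) -> bool:
    """Check a truth assignment against implication/contradiction constraints."""
    for a, kind, b in relations:
        if kind == IMPLIES and assignment[a] and not assignment[b]:
            return False
        if kind == CONTRADICTS and assignment[a] and assignment[b]:
            return False
    return True


def most_probable_consistent(
    probs: List[float],
    relations: List[Relation],
    fixed: Optional[Dict[int, bool]] = None,
) -> Optional[Tuple[bool, ...]]:
    """Brute-force search over all truth assignments (exponential; fine for
    a handful of statements). Each statement is scored independently and the
    joint score is the product of the per-statement probabilities."""
    fixed = fixed or {}
    best, best_score = None, -1.0
    for assignment in product([True, False], repeat=len(probs)):
        if any(assignment[i] != value for i, value in fixed.items()):
            continue
        if not is_consistent(assignment, relations):
            continue
        joint = 1.0
        for p, is_true in zip(probs, assignment):
            joint *= p if is_true else 1.0 - p
        if joint > best_score:
            best, best_score = assignment, joint
    return best


def build_finetuning_set(
    seed: str,
    generate: Callable[[str], List[Tuple[str, str]]],  # stand-in for LM prompting
    score: Callable[[str], float],                     # stand-in for LM P(true)
    supervised: bool,
) -> List[str]:
    """Return the statements inferred to be true, i.e. candidate fine-tuning text."""
    generated = generate(seed)
    statements = [seed] + [text for text, _ in generated]
    relations = [(0, kind, k + 1) for k, (_, kind) in enumerate(generated)]
    probs = [score(s) for s in statements]
    # Supervised setting: the seed comes from a trusted source, so pin it to true.
    fixed = {0: True} if supervised else {}
    assignment = most_probable_consistent(probs, relations, fixed)
    if assignment is None:
        return []
    return [s for s, is_true in zip(statements, assignment) if is_true]


# Toy usage with hand-written stand-ins for the LM (values are illustrative only).
seed = "Paris is the capital of France."
toy_generated = [
    ("France's capital city is Paris.", IMPLIES),
    ("Lyon is the capital of France.", CONTRADICTS),
]
toy_scores = {seed: 0.6, toy_generated[0][0]: 0.9, toy_generated[1][0]: 0.55}

kept = build_finetuning_set(
    seed, lambda _: toy_generated, lambda s: toy_scores[s], supervised=False
)
print(kept)
```

In this toy example, scoring each statement in isolation with a 0.5 threshold would accept all three statements, including the one that contradicts the seed. Scoring them jointly under the consistency constraints rejects the contradicting statement, which is the kind of global reasoning the abstract refers to.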