Automatic fact-checking plays a crucial role in combating the spread of misinformation. Large language models (LLMs) and their instruction-following variants, such as InstructGPT and Alpaca, have shown remarkable performance on a range of natural language processing tasks. However, their knowledge may be outdated or insufficient, which can lead to inaccurate fact-checking. To address this limitation, we propose combining instruction-following language models with external evidence retrieval. Our approach uses search engines to retrieve evidence relevant to a given input claim; this evidence supplements the knowledge of the pretrained language model. We then instruction-tune an open-source language model, LLaMA, with this evidence so that it predicts the veracity of the input claim more accurately. We evaluate our method on two widely used fact-checking datasets, RAWFC and LIAR, and the results show that it achieves state-of-the-art performance. By integrating external evidence, we bridge the gap between the model's knowledge and the most up-to-date and sufficient context available, which improves fact-checking outcomes. Our findings have implications for combating misinformation and promoting the dissemination of accurate information on online platforms. Our released materials are accessible at: https://thcheung.github.io/factllama.
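The pipeline described above (retrieve evidence for a claim, then present claim and evidence to an instruction-tuned model) can be illustrated with a minimal sketch of the prompt-construction step. Everything here is an assumption made for illustration and is not taken from the released materials: the `Evidence` type, the `build_prompt` function, the Alpaca-style section headers, the label names, and the `max_evidence` cutoff are all hypothetical. The retrieval step itself is left out; the evidence list stands in for whatever a search engine returns.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Evidence:
    """One retrieved search result (hypothetical structure)."""
    title: str
    snippet: str


def build_prompt(
    claim: str,
    evidence: Sequence[Evidence],
    labels: Sequence[str],
    max_evidence: int = 3,
) -> str:
    """Format a claim and its retrieved evidence as an instruction prompt.

    The section headers follow a generic Alpaca-style layout; this template
    is an assumption, not the one used by the authors.
    """
    lines = [
        "### Instruction:",
        f"Classify the claim as one of: {', '.join(labels)}. "
        "Use the evidence if it is relevant.",
        "",
        "### Claim:",
        claim.strip(),
        "",
        "### Evidence:",
    ]
    # Keep only the top-ranked results so the prompt stays short.
    kept = list(evidence)[:max_evidence]
    if kept:
        for i, ev in enumerate(kept, start=1):
            lines.append(f"[{i}] {ev.title.strip()}: {ev.snippet.strip()}")
    else:
        # Make the absence of evidence explicit rather than leaving a gap.
        lines.append("(none retrieved)")
    lines += ["", "### Response:"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder claim, evidence, and labels for demonstration only.
    demo = build_prompt(
        "Example claim to verify.",
        [Evidence("Example source", "Example snippet text.")],
        labels=["true", "half-true", "false"],
    )
    print(demo)
```

During instruction tuning, a prompt of this form would be paired with the gold veracity label as the target response. At inference time the model would be asked to complete the text after `### Response:`.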