Fact-checking based on commercial LLMs has become mainstream. Although these methods offer high explainability, they fall short of traditional fine-tuning approaches in accuracy, and data security is also a significant concern. In this paper, we propose a self-instruction-based fine-tuning approach for fact-checking that balances accuracy and explainability. Our method consists of two stages: data augmentation and improved-DPO fine-tuning. The former first instructs the model to generate both positive and negative explanations from claim-evidence pairs and their labels, then samples the dataset according to our customized difficulty criteria. The latter fine-tunes the model on the generated samples using our proposed improved DPO. We fine-tune the smallest-scale LLaMA-7B model and evaluate it on the challenging fact-checking datasets FEVEROUS and HOVER, comparing against four fine-tuning methods and three few-shot learning methods. The experiments demonstrate that our approach not only achieves accuracy comparable to, or even surpassing, traditional fine-tuning methods, but also generates fluent explanation text. Moreover, it exhibits strong generalization performance. As the experiments show, our method is the first to leverage self-supervised learning for fact-checking and innovatively combines contrastive learning with improved DPO in fine-tuning LLMs.
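For context, the preference-optimization stage builds on the standard DPO objective; the specific modifications constituting the "improved DPO" are not stated here, so the following is only the well-known baseline loss (Rafailov et al.'s formulation), where $y_w$ and $y_l$ would correspond to the positive and negative explanations generated in the data-augmentation stage, $\pi_{\mathrm{ref}}$ is the frozen reference model, and $\beta$ a temperature hyperparameter:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
\left[ \log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right) \right]
```

Minimizing this loss increases the likelihood margin of the preferred (positive) explanation over the dispreferred (negative) one relative to the reference model, without training an explicit reward model.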