Automated fact-checking remains a challenging task for the research community. Prior work has explored strategies such as end-to-end training, retrieval-augmented generation, and prompt engineering to build robust fact-checking systems, but the resulting accuracy has not been high enough for real-world deployment. We propose a new learning paradigm in which evidence classification and entailed justifications produced by generative language models (GLMs) are used to train encoder-only language models (ELMs). We conducted a rigorous set of experiments comparing our approach with recent work and with a range of prompting and fine-tuning strategies. We also performed ablation studies, error analysis, a quality analysis of the model explanations, and a domain generalisation study to provide a comprehensive understanding of our approach.