Automated fact-checking remains a challenging task for the research community. Prior work has explored various strategies, such as end-to-end training, retrieval-augmented generation, and prompt engineering, to build robust fact-checking systems; however, their accuracy has not been high enough for real-world deployment. In contrast, we propose a new learning paradigm in which evidence classifications and entailed justifications produced by generative language models (GLMs) are used to train encoder-only language models (ELMs). We conduct a rigorous set of experiments comparing our approach with recent work as well as various prompting and fine-tuning strategies. Additionally, we present ablation studies, error analysis, a quality analysis of model explanations, and a domain generalisation study to provide a comprehensive understanding of our approach.