Misinformation detection is a critical task that, like manual fact-checking, can benefit significantly from external knowledge. In this work, we propose a novel method for representing textual documents that facilitates the incorporation of information from a knowledge base. Our approach, Text Encoding with Graph (TEG), extracts structured information from each document in the form of a graph and encodes both the text and the graph for classification. Through extensive experiments, we demonstrate that this hybrid representation improves misinformation detection performance over language models alone. Furthermore, we introduce TEGRA, an extension of our framework that integrates domain-specific knowledge and further improves classification accuracy in most cases.
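To make the idea of a hybrid text-plus-graph representation concrete, the following is a minimal sketch and not the TEG implementation. The abstract does not specify the extraction procedure, the encoders, or how the two encodings are combined, so everything here is an assumption made for illustration: the function names (`extract_graph`, `encode_text`, `encode_graph`, `hybrid_representation`) and the constant `DIM` are hypothetical, a capitalized-word co-occurrence heuristic stands in for structured extraction, hashed count vectors stand in for the language model and the graph encoder, and simple concatenation stands in for the fusion step.

```python
import hashlib
import re

DIM = 16  # hypothetical width of each toy encoding


def _bucket(key: str, dim: int = DIM) -> int:
    """Deterministically map a string to an index in [0, dim)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % dim


def extract_graph(text: str) -> set:
    """Toy stand-in for structured extraction (an assumption, not TEG).

    Capitalized words act as entity proxies; two entities are linked
    by an undirected edge when they occur in the same sentence.
    """
    edges = set()
    for sentence in re.split(r"[.!?]", text):
        entities = sorted(set(re.findall(r"\b[A-Z][a-z]+\b", sentence)))
        for i, head in enumerate(entities):
            for tail in entities[i + 1:]:
                edges.add((head, tail))
    return edges


def encode_text(text: str, dim: int = DIM) -> list:
    """Hashed bag-of-words counts, standing in for a language model."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        vec[_bucket(token, dim)] += 1.0
    return vec


def encode_graph(edges: set, dim: int = DIM) -> list:
    """Hashed edge counts, standing in for a graph encoder."""
    vec = [0.0] * dim
    for head, tail in sorted(edges):
        vec[_bucket(f"{head}|{tail}", dim)] += 1.0
    return vec


def hybrid_representation(text: str, dim: int = DIM) -> list:
    """Concatenate the text and graph encodings (assumed fusion step)."""
    return encode_text(text, dim) + encode_graph(extract_graph(text), dim)
```

Calling `hybrid_representation("Paris hosts Unesco. Unesco praised Paris and Rome.")` returns a vector of length `2 * DIM` whose first half depends only on the tokens and whose second half depends only on the extracted edges. A downstream classifier would consume this vector; in a setting like the one the abstract describes, knowledge-base information would presumably enter through the graph side, although how TEG and TEGRA do this is not stated here.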