Misinformation detection can benefit substantially from external knowledge, much as manual fact-checking does. In this work, we propose a novel method for representing textual documents that facilitates incorporating information from a knowledge base. Our approach, Text Encoding with Graph (TEG), extracts structured information from each document in the form of a graph and encodes both the text and the graph for classification. Through extensive experiments, we demonstrate that this hybrid representation improves misinformation detection performance over using language models alone. We further introduce TEGRA, an extension of our framework that integrates domain-specific knowledge and increases classification accuracy further in most cases.