Existing benchmarks for fake news detection have substantially advanced models that assess the authenticity of news content. However, these benchmarks typically cover news from a single semantic topic or a single platform, and so fail to capture the diversity of multi-domain news in real-world scenarios. Understanding fake news across domains requires external knowledge and fine-grained annotations, which provide precise evidence and reveal the diverse strategies used for fabrication; existing benchmarks ignore both. To address this gap, we introduce a novel multi-domain knowledge-enhanced benchmark with fine-grained annotations, named \textbf{FineFake}. FineFake comprises 16,909 data samples spanning six semantic topics and eight platforms. Each news item is enriched with multi-modal content, potential social context, semi-manually verified common knowledge, and fine-grained annotations that go beyond conventional binary labels. Furthermore, we formulate three challenging tasks based on FineFake and propose a knowledge-enhanced domain adaptation network. Extensive experiments on FineFake under various scenarios provide accurate and reliable benchmarks for future work. The entire FineFake project is publicly available as an open-source repository at \url{https://github.com/Accuser907/FineFake}.