FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability through 5W Question-Answering

Megha Chakraborty, Khusbu Pahwa, Anku Rani, Adarsh Mahor, Aditya Pakala, Arghya Sarkar, Harshit Dave, Ishan Paul, Janvita Reddy, Preethi Gurumurthy, Ritvik G, Samahriti Mukherjee, Shreyas Chatterjee, Kinjal Sensharma, Dwip Dalal, Suryavardan S, Shreyash Mishra, Parth Patwa, Aman Chadha, Amit Sheth, Amitava Das

Source: arXiv. arXiv admin note: text overlap with arXiv:2305.04329

Disinformation is one of today's most pressing societal crises: about 67% of the American population believes that disinformation produces a lot of uncertainty, and 10% of them knowingly propagate it. Evidence shows that disinformation can manipulate democratic processes and public opinion, disrupting stock markets, causing panic and anxiety in society, and even leading to deaths during crises. It should therefore be identified promptly and, where possible, mitigated. With approximately 3.2 billion images and 720,000 hours of video shared online daily on social media platforms, detecting multimodal disinformation at scale requires efficient fact verification. Despite progress in automatic text-based fact verification (e.g., FEVER, LIAR), the research community has devoted little substantial effort to multimodal fact verification. To address this gap, we introduce FACTIFY 3M, a multimodal fake news dataset of 3 million samples that pushes the boundaries of fact verification and offers explainability through 5W question-answering. Salient features of the dataset include: (i) textual claims, (ii) ChatGPT-generated paraphrased claims, (iii) associated images, (iv) Stable Diffusion-generated additional images (i.e., visual paraphrases), (v) pixel-level image heatmaps to foster image-text explainability of the claim, (vi) 5W QA pairs, and (vii) adversarial fake news stories.
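As a rough illustration of how the seven features listed above might sit together in one record, the following is a minimal sketch. It is not the dataset's actual schema: the class name `FactifySample`, every field name, and the helper `missing_ws` are hypothetical, and the sketch assumes the 5W questions are the conventional who, what, when, where, and why.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumption: "5W" refers to the conventional who/what/when/where/why.
FIVE_W = ("who", "what", "when", "where", "why")


@dataclass
class FactifySample:
    """Hypothetical container for one sample; field names are illustrative only."""

    claim: str                                                       # (i) textual claim
    paraphrased_claims: List[str] = field(default_factory=list)      # (ii) paraphrased claims
    image_path: Optional[str] = None                                 # (iii) associated image
    generated_image_paths: List[str] = field(default_factory=list)   # (iv) visual paraphrases
    heatmap_path: Optional[str] = None                               # (v) pixel-level heatmap
    # (vi) 5W QA pairs, keyed by question type, each a list of (question, answer)
    qa_pairs: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)
    adversarial_story: Optional[str] = None                          # (vii) adversarial story

    def missing_ws(self) -> List[str]:
        """Return the 5W question types that have no QA pair yet."""
        return [w for w in FIVE_W if not self.qa_pairs.get(w)]


# Placeholder values only; these are not drawn from the dataset.
sample = FactifySample(
    claim="<claim text>",
    qa_pairs={"who": [("<who question>", "<answer>")]},
)
print(sample.missing_ws())
```

Keying the QA pairs by question type is one possible design choice; it makes it easy to see which aspects of a claim still lack a question-answer pair.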
