The rapid spread of information through mobile devices and media has led to the widespread circulation of false or deceptive news, causing significant concern in society. Among the different types of misinformation, image repurposing, also known as out-of-context misinformation, remains highly prevalent and effective. However, current approaches for detecting out-of-context misinformation often lack interpretability and offer limited explanations. In this study, we propose a logic regularization approach for out-of-context detection called LOGRAN (LOGic Regularization for out-of-context ANalysis). The primary objective of LOGRAN is to decompose out-of-context detection to the phrase level. By employing latent variables for phrase-level predictions, the final prediction for the image-caption pair can be aggregated using logical rules. The latent variables also explain how the final result is derived, making this fine-grained detection method inherently interpretable. We evaluate the performance of LOGRAN on the NewsCLIPpings dataset, achieving competitive overall results. Visualized examples further show faithful phrase-level predictions for out-of-context images, accompanied by explanations. These results highlight the effectiveness of our approach in addressing out-of-context detection and enhancing interpretability.
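The aggregation step can be illustrated with a minimal sketch. The abstract does not specify the exact logical rules, so the following assumes a simple disjunctive rule (the pair is out-of-context if at least one phrase is inconsistent with the image), relaxed into a differentiable noisy-OR over phrase-level latent probabilities; function names and example phrases are hypothetical.

```python
# Hypothetical sketch of LOGRAN-style aggregation (assumed rule, not the
# paper's exact formulation): each phrase i gets a latent probability p_i
# of being inconsistent with the image; a logical OR over phrases is
# relaxed to a noisy-OR so the rule stays differentiable for training.

def noisy_or(phrase_probs):
    """Soft logical OR: P(out-of-context) = 1 - prod_i (1 - p_i)."""
    prob_all_consistent = 1.0
    for p in phrase_probs:
        prob_all_consistent *= (1.0 - p)
    return 1.0 - prob_all_consistent

def explain(phrases, phrase_probs, threshold=0.5):
    """The latent variables double as explanations: return the phrases
    flagged as inconsistent with the image."""
    return [ph for ph, p in zip(phrases, phrase_probs) if p >= threshold]

# Toy caption decomposed into phrases, with illustrative latent scores.
phrases = ["protest in Paris", "last Tuesday", "thousands gathered"]
probs = [0.9, 0.2, 0.1]
print(round(noisy_or(probs), 3))  # 0.928 -> pair predicted out-of-context
print(explain(phrases, probs))    # ['protest in Paris'] is the culprit
```

The explanation falls out of the same latent variables used for the final score, which is the sense in which the abstract calls the method inherently interpretable.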