Multimodal misinformation on online social platforms is becoming a critical concern because multimedia content lends it greater credibility and makes it easier to disseminate than traditional text-only information. While existing multimodal detection approaches have achieved high performance, their lack of interpretability hinders the reliability and practical deployment of these systems. Inspired by Neural-Symbolic AI, which combines the learning ability of neural networks with the explainability of symbolic learning, we propose a novel logic-based neural model for multimodal misinformation detection that integrates interpretable logic clauses to express the reasoning process of the target task. To make learning effective, we parameterize symbolic logical elements using neural representations, which facilitates the automatic generation and evaluation of meaningful logic clauses. Additionally, to make our framework generalizable across diverse misinformation sources, we introduce five meta-predicates that can be instantiated with different correlations. Results on three public datasets (Twitter, Weibo, and Sarcasm) demonstrate the feasibility and versatility of our model.