As short video platforms have become prominent channels for news dissemination, major platforms in China have gradually turned into fertile ground for the spread of fake news. Identifying rumors in short videos is challenging, however, because each video carries a large amount of information and many videos share similar features, which makes them hard to tell apart. To curb the spread of short video rumors, our research group weighs the merits and drawbacks of existing algorithms and proposes a method that combines multimodal feature fusion with external knowledge. The proposed detection approach consists of the following steps: (1) constructing a comprehensive dataset of multiple features extracted from short videos; (2) developing a multimodal rumor detection model: we first use the Temporal Segment Networks (TSN) video encoding model to extract video features, then apply Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) to extract textual features, and finally use the BERT model to fuse the textual and video features; (3) achieving distinction through contrastive learning: we acquire external knowledge by crawling relevant sources and use a vector database to incorporate this knowledge into the classification output. The research is driven by practical needs, and its results are expected to be valuable in practical scenarios such as short video rumor identification and the management of public opinion.
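To make step (3) more concrete, the following is a minimal sketch of one way retrieved external knowledge could be folded into a classifier's output. The abstract does not specify this mechanism, so everything here is an assumption made for illustration: the in-memory list stands in for a real vector database, and the names `KnowledgeEntry`, `retrieve`, `knowledge_adjusted_score`, and the blending `weight` are hypothetical. The embeddings are toy two-dimensional vectors rather than BERT or TSN features.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class KnowledgeEntry:
    """One crawled item of external knowledge (hypothetical schema)."""
    text: str
    embedding: Sequence[float]
    is_rumor: bool  # True if the source debunks the claim as a rumor


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(store: List[KnowledgeEntry], query: Sequence[float],
             k: int = 2) -> List[Tuple[float, KnowledgeEntry]]:
    """Return the k entries most similar to the query embedding."""
    scored = [(cosine(query, entry.embedding), entry) for entry in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def knowledge_adjusted_score(model_prob: float, query: Sequence[float],
                             store: List[KnowledgeEntry], k: int = 2,
                             weight: float = 0.5) -> float:
    """Blend the model's rumor probability with a similarity-weighted vote
    of the retrieved entries. The linear blend is an assumed design, not
    the method described in the abstract."""
    # Negative similarities are clipped so they cannot vote.
    neighbors = [(max(sim, 0.0), entry) for sim, entry in retrieve(store, query, k)]
    total = sum(sim for sim, _ in neighbors)
    if total == 0.0:
        return model_prob  # no usable evidence: keep the model's output
    knowledge_prob = sum(sim for sim, e in neighbors if e.is_rumor) / total
    return (1.0 - weight) * model_prob + weight * knowledge_prob


# Toy knowledge base with hand-written 2-D embeddings.
store = [
    KnowledgeEntry("debunked claim", [1.0, 0.0], True),
    KnowledgeEntry("verified report", [0.0, 1.0], False),
]

# A video whose fused embedding matches the debunked claim is pushed
# toward the rumor class; one matching the verified report is pushed away.
rumor_like = knowledge_adjusted_score(0.5, [1.0, 0.0], store)
fact_like = knowledge_adjusted_score(0.5, [0.0, 1.0], store)
```

With an uncertain model output of 0.5, the first query moves to 0.75 and the second to 0.25, which shows how retrieved evidence can shift a borderline classification in either direction. In a real system the query vector would come from the fused multimodal representation, and the linear search would be replaced by the vector database's nearest-neighbor query.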