Misinformation spreads rapidly on social media, causing serious damage by influencing public opinion, promoting dangerous behavior, and eroding trust in reliable sources. It spreads too fast for traditional fact-checking, underscoring the need for predictive methods. We introduce CROWDSHIELD, a crowd intelligence-based method for early misinformation prediction. We hypothesize that the crowd's reactions to a post reveal its veracity; specifically, we draw on two signals within a conversation thread: exaggerated assertions/claims in the source post and the positions/stances expressed in its replies. We capture these two dimensions -- stances and claims -- with deep Q-learning, chosen for its proficiency in navigating complex decision spaces and effectively learning network properties. Additionally, we use a transformer-based encoder to develop a comprehensive understanding of both content and context. This multifaceted approach helps ensure the model attends to user interaction while staying anchored in the communication's content. We also propose MIST, a manually annotated Twitter corpus for misinformation detection comprising nearly 200 conversation threads with more than 14K replies. In experiments, CROWDSHIELD outperforms ten baseline systems, improving the macro-F1 score by roughly 4 points. We conduct an ablation study and error analysis to further validate the proposed model's performance. The source code and dataset are available at https://github.com/LCS2-IIITD/CrowdShield.git.
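To make the two-dimensional formulation concrete, the following is a minimal, hypothetical sketch of Q-learning over stance and claim signals. It uses a tabular Q-table whose states pair a reply's stance with whether the source claim is flagged as exaggerated, and whose actions are the thread-level labels "real"/"fake". All names, state/action sets, and rewards here are illustrative assumptions; the paper's actual model is a deep Q-network coupled with a transformer encoder, not this toy table.

```python
import random

# Hypothetical stance and action inventories (not taken from the paper).
STANCES = ["support", "deny", "query", "comment"]
ACTIONS = ["real", "fake"]

def train(threads, episodes=300, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning over (stance, exaggerated-claim) states.

    threads: list of (replies, label) pairs, where replies is a list of
    (stance, exaggerated) tuples and label is "real" or "fake".
    """
    rng = random.Random(seed)
    # One Q-row per (stance, exaggerated) state, one value per action.
    q = {(s, e): [0.0, 0.0] for s in STANCES for e in (False, True)}
    for _ in range(episodes):
        for replies, label in threads:
            for stance, exaggerated in replies:
                row = q[(stance, exaggerated)]
                # Epsilon-greedy action selection.
                a = (rng.randrange(2) if rng.random() < epsilon
                     else max(range(2), key=lambda i: row[i]))
                reward = 1.0 if ACTIONS[a] == label else -1.0
                # One-step update; each reply is treated as terminal,
                # so there is no bootstrapped next-state term.
                row[a] += alpha * (reward - row[a])
    return q

def predict(q, replies):
    """Label a thread by majority vote of per-reply greedy actions."""
    votes = [max(range(2), key=lambda i: q[(s, e)][i]) for s, e in replies]
    return ACTIONS[round(sum(votes) / len(votes))]
```

In the full model, the discrete state lookup would be replaced by a network that maps transformer embeddings of the source post and replies to Q-values, which is what lets the approach generalize beyond a fixed stance vocabulary.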