TwiBot-22: Towards Graph-Based Twitter Bot Detection

Shangbin Feng, Zhaoxuan Tan, Herun Wan, Ningnan Wang, Zilong Chen, Binchi Zhang, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, Xinshun Feng, Qingyue Zhang, Hongrui Wang, Yuhan Liu, Yuyang Bai, Heng Wang, Zijian Cai, Yanbo Wang, Lijing Zheng, Zihan Ma, Jundong Li, Minnan Luo

From arXiv; NeurIPS 2022, Datasets and Benchmarks Track

Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they exhibit promising performance when confronting novel Twitter bots that traditional methods fail to detect. However, very few of the existing Twitter bot detection datasets are graph-based, and even these few graph-based datasets suffer from limited scale, incomplete graph structure, and low annotation quality. In fact, the lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of novel graph-based bot detection approaches. In this paper, we propose TwiBot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has considerably better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including TwiBot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented code and datasets into the TwiBot-22 evaluation framework, where researchers can consistently evaluate new models and datasets. The TwiBot-22 Twitter bot detection benchmark and evaluation framework are publicly available at https://twibot22.github.io/
