At the end of October 2022, Elon Musk concluded his acquisition of Twitter. In the weeks and months before that, several questions were publicly discussed that were of interest not only to the platform's prospective buyers but also to the Computational Social Science research community. For example, how many active users does the platform have? What percentage of accounts on the site are bots? And what are the dominant topics and sub-topical spheres on the platform? To shed light on these questions, and to offer a dataset that will equip other researchers to do the same, we collected, in a globally coordinated effort of 80 scholars, all 375 million tweets published within a 24-hour period starting on September 21, 2022. To the best of our knowledge, this is the first complete 24-hour Twitter dataset available to the research community. With it, the present work aims to accomplish two goals. First, we seek to answer the aforementioned questions and provide descriptive metrics about Twitter that can serve as references for other researchers. Second, we create a baseline dataset that future research can use to study the potential impact of the platform's ownership change.