At the end of October 2022, Elon Musk concluded his acquisition of Twitter. In the weeks and months before, several publicly discussed questions were of interest not only to the platform's prospective buyers but also to the Computational Social Science research community. For example, how many active users does the platform have? What percentage of accounts on the site are bots? And what are the dominant topics and sub-topical spheres on the platform? In a globally coordinated effort of 80 scholars to shed light on these questions, and to offer a dataset that equips other researchers to do the same, we collected all 375 million tweets published within a 24-hour period starting on September 21, 2022. To the best of our knowledge, this is the first complete 24-hour Twitter dataset available to the research community. With it, the present work pursues two goals. First, we answer the aforementioned questions and provide descriptive metrics about Twitter that can serve as references for other researchers. Second, we create a baseline dataset for future research that can be used to study the potential impact of the platform's ownership change.