On 24 February 2022, Russia invaded Ukraine, starting what is now known as the Russo-Ukrainian War and initiating an online discourse on social media. Twitter, one of the most popular social networks (SNs), has an open and democratic character that enables transparent discussion among its large user base. Unfortunately, this openness often leads to violations of Twitter's policies, including propaganda, abusive actions, and civil integrity violations, and consequently to the suspension and deletion of user accounts. This study focuses on the Twitter suspension mechanism and analyzes the shared content and account features that may lead to suspension. Toward this goal, we obtained a dataset of 107.7 million tweets, originating from 9.8 million users, using the Twitter API. We extract the categories of content shared by suspended accounts and explain their characteristics by extracting text embeddings in conjunction with cosine similarity clustering. Our results reveal scam campaigns that exploit trending topics regarding the Russo-Ukrainian conflict for Bitcoin and Ethereum fraud, spam, and advertisement campaigns. Additionally, we apply a machine learning methodology that includes a SHapley Additive exPlanations (SHAP) explainability model to understand and explain how user accounts get suspended.
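As a rough illustration of grouping texts by the cosine similarity of their embeddings, the following minimal sketch clusters a handful of vectors greedily. It is not the pipeline used in this study: the toy three-dimensional vectors stand in for real text embeddings, and the greedy assignment rule, the `cluster_by_cosine` helper, and the 0.9 threshold are all assumptions chosen for brevity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_a * norm_b)


def cluster_by_cosine(embeddings, threshold=0.9):
    """Greedy clustering (hypothetical helper, not the study's method).

    Each vector joins the first cluster whose seed vector is at least
    `threshold` similar to it; otherwise it seeds a new cluster.
    """
    seeds = []   # first vector of each cluster
    labels = []  # cluster index assigned to each input vector
    for vec in embeddings:
        for cluster_id, seed in enumerate(seeds):
            if cosine_similarity(vec, seed) >= threshold:
                labels.append(cluster_id)
                break
        else:
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels


# Toy vectors standing in for tweet embeddings (made up for illustration).
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
]

labels = cluster_by_cosine(toy_embeddings, threshold=0.9)
print(labels)
```

In practice the vectors would come from a sentence-embedding model, and the threshold would need tuning: a higher value yields tighter clusters of near-duplicate content, such as repeated scam or spam messages.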