The escalating proliferation of inorganic accounts, commonly known as bots, within the digital ecosystem poses an ongoing and multifaceted challenge to online security, trustworthiness, and user experience. These bots, often used to disseminate malicious propaganda and manipulate public opinion, wield significant influence on social media, with far-reaching implications for electoral processes, political campaigns, and international conflicts. Swift and accurate identification of inorganic accounts is therefore essential to mitigating their detrimental effects. This paper focuses on identifying such accounts and explores methods for detecting them with machine learning techniques. Specifically, the study extracts temporal and semantic features from tweet behavior and proposes a bot detection algorithm built on fundamental machine learning approaches, including Support Vector Machines (SVM) and k-means clustering. The study also ranks the importance of the extracted features for each detection technique and provides uncertainty quantification using a distribution-free method called conformal prediction, thereby contributing to the development of effective strategies for combating inorganic accounts on social media platforms.
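As a rough illustration of the uncertainty quantification step, the sketch below shows split conformal prediction wrapped around a generic classifier. It is not the authors' implementation. To stay dependency-free, a nearest-centroid scorer stands in for the SVM, the two features are synthetic placeholders for the temporal and semantic features, and every name (`fit_centroids`, `conformal_threshold`, `prediction_set`, the `alpha` level) is an assumption made for this example.

```python
import math
import random

def fit_centroids(X, y):
    """Stand-in classifier (NOT the paper's SVM): one mean vector per class."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == c]
        model[c] = [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]
    return model

def predict_proba(model, x):
    """Softmax over negative Euclidean distances to each class centroid."""
    logits = {c: -math.dist(x, mu) for c, mu in model.items()}
    top = max(logits.values())
    exps = {c: math.exp(v - top) for c, v in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def conformal_threshold(model, X_cal, y_cal, alpha):
    """Split conformal calibration with nonconformity score 1 - p(true label)."""
    scores = sorted(1.0 - predict_proba(model, x)[t] for x, t in zip(X_cal, y_cal))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    # If the calibration set is too small for the requested level,
    # the only valid threshold keeps every label.
    return scores[k - 1] if k <= n else float("inf")

def prediction_set(model, x, qhat):
    """All labels whose nonconformity score does not exceed the threshold."""
    return {c for c, p in predict_proba(model, x).items() if 1.0 - p <= qhat}

# Synthetic data: two hypothetical features per account, labels "bot"/"human".
rng = random.Random(0)

def sample(n):
    X, y = [], []
    for _ in range(n):
        label = rng.choice(["bot", "human"])
        mu = 0.0 if label == "bot" else 3.0
        X.append([rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)])
        y.append(label)
    return X, y

X_train, y_train = sample(200)
X_cal, y_cal = sample(200)    # held out from training, used only for calibration
X_test, y_test = sample(200)

alpha = 0.1                   # target miscoverage rate (assumed for the example)
model = fit_centroids(X_train, y_train)
qhat = conformal_threshold(model, X_cal, y_cal, alpha)

sets = [prediction_set(model, x, qhat) for x in X_test]
coverage = sum(t in s for s, t in zip(sets, y_test)) / len(y_test)
print(f"threshold={qhat:.3f}  empirical coverage={coverage:.3f}")
```

The method is distribution-free in the sense that the coverage guarantee relies only on the calibration and test accounts being exchangeable, not on any assumed data distribution. The classifier affects how small the prediction sets are, not whether the guarantee holds, so an SVM's scores could be substituted for `predict_proba` without changing the calibration logic.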