The escalating proliferation of inorganic accounts, commonly known as bots, poses an ongoing and multifaceted challenge to online security, trustworthiness, and user experience. Often employed to disseminate malicious propaganda and manipulate public opinion, these bots wield significant influence on social media, with far-reaching implications for electoral processes, political campaigns, and international conflicts. Swift and accurate identification of inorganic accounts is therefore essential to mitigating their detrimental effects. This paper focuses on identifying such accounts and explores machine learning methods for detecting them. It extracts temporal and semantic features from tweet behavior and proposes a bot detection algorithm built on fundamental machine learning approaches, including Support Vector Machines (SVM) and k-means clustering. It also ranks the importance of the extracted features for each detection technique and quantifies uncertainty using a distribution-free method called conformal prediction, thereby contributing to the development of effective strategies for combating inorganic accounts on social media platforms.
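To illustrate the distribution-free uncertainty quantification mentioned above, the following is a minimal sketch of split conformal prediction for a two-class bot/human problem. It is not the paper's implementation: the function names `conformal_threshold` and `prediction_set`, the label names, and the nonconformity score (one minus the probability assigned to the true label) are assumptions made for illustration. The per-label probabilities are assumed to come from some already trained classifier, for example an SVM with probability outputs.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Compute a split-conformal threshold from held-out calibration data.

    cal_probs  -- list of dicts mapping label -> predicted probability
    cal_labels -- true label of each calibration account
    alpha      -- target miscoverage rate (0.1 aims for about 90% coverage)
    """
    # Nonconformity score (an assumed choice): 1 - probability of the true label.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank of the score to use as the threshold.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every label.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, threshold):
    """Return every label whose nonconformity score is within the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= threshold}


# Hypothetical calibration outputs, not data from the paper.
cal_probs = [
    {"bot": 0.75, "human": 0.25},
    {"bot": 0.5, "human": 0.5},
    {"bot": 0.75, "human": 0.25},
]
cal_labels = ["bot", "human", "human"]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
confident = prediction_set({"bot": 0.75, "human": 0.25}, threshold)
ambiguous = prediction_set({"bot": 0.5, "human": 0.5}, threshold)
```

A prediction set containing a single label signals a confident call, while a set containing both labels flags an account the classifier cannot separate at the chosen coverage level. The coverage guarantee relies on the calibration accounts being exchangeable with the accounts scored later.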