Telegram is one of the most widely used instant messaging apps worldwide. Part of its success lies in its strong privacy protection and in social network features such as channels: virtual rooms in which only the admins can post, broadcasting messages to all subscribers. However, these same features have contributed to the emergence of borderline activities and, as is common with Online Social Networks, to a heavy presence of fake accounts. Telegram has started to address these issues by introducing verified and scam marks for channels. Unfortunately, the problem is far from solved. In this work, we perform a large-scale analysis of Telegram by collecting 35,382 different channels and over 130,000,000 messages. We study the channels that Telegram marks as verified or scam, highlighting analogies and differences. Then, we move to the unmarked channels. Here, we find some of the infamous activities also present on privacy-preserving services of the Dark Web, such as carding and the sharing of illegal adult and copyright-protected content. In addition, we identify and analyze two other types of channels: clones and fakes. Clones are channels that publish the exact content of another channel to gain subscribers and promote services. Fakes, by contrast, are channels that attempt to impersonate celebrities or well-known services. Fakes are hard to identify, even for the most advanced users. To detect fake channels automatically, we propose a machine learning model that identifies them with an accuracy of 86%. Lastly, we study Sabmyk, a conspiracy theory that exploited fakes and clones to spread quickly on the platform, reaching over 1,000,000 users.