Large language models (LLMs) exhibit impressive capabilities in generating realistic text across diverse subjects. Concerns have been raised that they could be used to produce fake content with deceptive intent, although evidence thus far remains anecdotal. This paper presents a case study of a Twitter botnet that appears to employ ChatGPT to generate human-like content. Through heuristics, we identify 1,140 accounts and validate them via manual annotation. These accounts form a dense cluster of fake personas that exhibit similar behaviors, including posting machine-generated content and stolen images, and engage with each other through replies and retweets. ChatGPT-generated content promotes suspicious websites and spreads harmful comments. While the accounts in the AI botnet can be detected through their coordination patterns, current state-of-the-art LLM content classifiers fail to discriminate between them and human accounts in the wild. These findings highlight the threats posed by AI-enabled social bots.