Recent progress in artificial intelligence (AI), particularly in the domain of large language models (LLMs), has resulted in powerful and versatile dual-use systems. Indeed, cognition can be put towards a wide variety of tasks, some of which can result in harm. This study investigates how LLMs can be used for spear phishing, a form of cybercrime that involves manipulating targets into divulging sensitive information. I first explore LLMs' ability to assist with the reconnaissance and message generation stages of a successful spear phishing attack, and find that advanced LLMs are capable of improving cybercriminals' efficiency during these stages. To explore how LLMs can be used to scale spear phishing campaigns, I then create unique spear phishing messages for over 600 British Members of Parliament using OpenAI's GPT-3.5 and GPT-4 models. My findings reveal that these messages are not only realistic but also cost-effective, with each email costing only a fraction of a cent to generate. Next, I demonstrate how basic prompt engineering can circumvent the safeguards installed in LLMs by reinforcement learning from human feedback (RLHF) fine-tuning, highlighting the need for more robust governance interventions aimed at preventing misuse. To address these evolving risks, I propose two potential solutions: structured access schemes, such as application programming interfaces, and LLM-based defensive systems.