The advanced capabilities of Large Language Models (LLMs) have made them invaluable across a wide range of applications, from conversational agents and content creation to data analysis, research, and innovation. However, their effectiveness and accessibility also make them susceptible to abuse for generating malicious content, including phishing attacks. This study explores whether four popular, commercially available LLMs, namely ChatGPT (GPT 3.5 Turbo), GPT 4, Claude, and Bard, can be used to generate functional phishing attacks through a series of malicious prompts. We find that these LLMs can generate both phishing emails and phishing websites that convincingly imitate well-known brands, and that the generated websites can incorporate a range of evasive tactics to elude the detection mechanisms of anti-phishing systems. Notably, these attacks can be generated using unmodified, or "vanilla," versions of these LLMs, without requiring any prior adversarial exploits such as jailbreaking. As a countermeasure, we build a BERT-based automated detection tool for the early detection of malicious prompts, so that LLMs can be prevented from generating phishing content. The tool attains an accuracy of 97\% for phishing website prompts and 94\% for phishing email prompts.