Phishing remains a critical cybersecurity threat, especially with the advent of large language models (LLMs) capable of generating highly convincing malicious content. Unlike earlier phishing attempts, which were identifiable by grammatical errors, misspellings, incorrect phrasing, and inconsistent formatting, LLM-generated emails are grammatically sound, contextually relevant, and linguistically natural. These advancements make phishing emails increasingly difficult to distinguish from legitimate ones, challenging traditional detection mechanisms. Conventional phishing detection systems often fail when faced with emails crafted by LLMs or manipulated using adversarial perturbation techniques. To address this challenge, we propose a robust phishing email detection system featuring an enhanced text preprocessing pipeline. This pipeline includes spelling correction and word splitting to counteract adversarial modifications and improve detection accuracy. Our approach integrates widely adopted natural language processing (NLP) feature extraction techniques and machine learning algorithms. We evaluate our models on publicly available datasets comprising both phishing and legitimate emails, achieving a detection accuracy of 94.26% and an F1-score of 84.39% in a model deployment setting. To assess robustness, we further evaluate our models on adversarial phishing samples generated by four attack methods in the Python TextAttack framework. Additionally, we evaluate the models' performance against phishing emails generated by LLMs, including ChatGPT and Llama. The results highlight the resilience of the models against evolving AI-powered phishing threats.
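The preprocessing idea in the abstract (spelling correction plus word splitting before feature extraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vocabulary `VOCAB`, the function names `split_word`, `correct_word`, and `normalize`, the similarity cutoff of 0.8, and the use of Python's standard `difflib` module are all assumptions made for illustration. A practical system would presumably rely on a large dictionary and a dedicated spell checker.

```python
import re
from difflib import get_close_matches

# Hypothetical toy vocabulary; a real system would load a full dictionary.
VOCAB = {"verify", "your", "account", "password", "click", "here",
         "now", "please", "to", "the", "bank", "update"}

def split_word(token, vocab):
    """Split a run-together token into vocabulary words.

    Returns the split with the fewest words, or None if no split exists.
    """
    n = len(token)
    best = [None] * (n + 1)  # best[i]: words covering token[:i], or None
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and token[j:i] in vocab:
                candidate = best[j] + [token[j:i]]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n]

def correct_word(token, vocab, cutoff=0.8):
    """Replace a token with its closest vocabulary word, if one is close enough."""
    matches = get_close_matches(token, sorted(vocab), n=1, cutoff=cutoff)
    return matches[0] if matches else token

def normalize(text, vocab=VOCAB):
    """Lowercase, tokenize, then apply word splitting and spelling correction."""
    cleaned = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in vocab:
            cleaned.append(token)
            continue
        parts = split_word(token, vocab)
        if parts:
            cleaned.extend(parts)       # e.g. "verifyyour" -> "verify", "your"
        else:
            cleaned.append(correct_word(token, vocab))  # e.g. "acount" -> "account"
    return " ".join(cleaned)
```

For example, `normalize("Please verifyyour acount now")` returns `"please verify your account now"`, undoing a removed space and a deleted character before the text would reach a feature extractor and classifier. Splitting is attempted before correction in this sketch so that merged words are not mistaken for a single misspelling; the ordering used in the actual system is not stated in the abstract.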