Detecting Phishing Sites Using ChatGPT

The rise of large language models (LLMs) has had a significant impact on various domains, including natural language processing and artificial intelligence. While LLMs such as ChatGPT have been extensively researched for tasks such as code generation and text synthesis, their application in detecting malicious web content, particularly phishing sites, has been largely unexplored. To combat the rising tide of automated cyber attacks facilitated by LLMs, it is imperative to automate the detection of malicious web content, which requires approaches that leverage the power of LLMs to analyze and classify phishing sites. In this paper, we propose a novel method that utilizes ChatGPT to detect phishing sites. Our approach involves leveraging a web crawler to gather information from websites and generate prompts based on this collected data. This approach enables us to detect various phishing sites without the need for fine-tuning machine learning models and identify social engineering techniques from the context of entire websites and URLs. To evaluate the performance of our proposed method, we conducted experiments using a dataset. The experimental results using GPT-4 demonstrated promising performance, with a precision of 98.3% and a recall of 98.4%. Comparative analysis between GPT-3.5 and GPT-4 revealed an enhancement in the latter's capability to reduce false negatives. These findings not only highlight the potential of LLMs in efficiently identifying phishing sites but also have significant implications for enhancing cybersecurity measures and protecting users from the dangers of online fraudulent activities.
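The crawl-then-prompt step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `CrawledPage` type, the `visible_text` and `build_prompt` helpers, the 2000-character cutoff, and the prompt wording are all assumptions made for the example. It uses only the Python standard library and stops short of calling any model API.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


@dataclass
class CrawledPage:
    """Hypothetical container for what a crawler collected from one site."""
    url: str
    html: str


def visible_text(html, max_chars=2000):
    """Return the page's visible text, truncated to keep the prompt short."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)[:max_chars]


def build_prompt(page):
    """Assemble a classification prompt from the URL and page text.

    The wording is illustrative only; the paper's actual prompt is not
    reproduced here.
    """
    return (
        "You are assessing whether a web page is a phishing site.\n"
        f"URL: {page.url}\n"
        f"Visible text: {visible_text(page.html)}\n"
        "Consider social engineering cues in the text and whether the URL "
        "matches the brand the page claims to represent.\n"
        'Answer with JSON: {"phishing": true|false, "reason": "..."}'
    )


if __name__ == "__main__":
    sample = CrawledPage(
        url="http://example.test/login",
        html="<html><body><h1>Verify your account</h1></body></html>",
    )
    print(build_prompt(sample))
```

The resulting string would then be sent to a chat model such as GPT-4, and the reply parsed into a phishing or non-phishing label. Putting the URL and the page text in the same prompt reflects the abstract's point that social engineering cues are read from the context of the whole site together with its URL.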

