Phishing websites remain a significant cybersecurity threat, and detecting them calls for mechanisms that are both accurate and cost-effective. In this paper, we present CLASP, a system that identifies phishing websites using multiple agents built on large language models (LLMs), each analyzing a different aspect of a web resource. The system accepts a URL or a QR code as input and employs specialized LLM-based agents that evaluate the URL structure, the webpage screenshot, and the HTML content to predict whether the resource is a phishing threat. To balance performance against operational cost, we experimented with multiple strategies for combining the agents' analyses and designed a combination that keeps the per-website evaluation cost low without compromising detection accuracy. We built the agents with several LLMs, including Gemini 1.5 Flash and GPT-4o mini, and found that Gemini 1.5 Flash performed best, achieving an F1 score of 83.01% on a newly curated dataset. The system averaged 2.78 seconds of processing time per website at an API cost of around $3.18 per 1,000 websites. Moreover, CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset. To support further research on more advanced phishing detection systems, we have made our dataset publicly available.
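The abstract does not spell out which combination strategy was chosen. One plausible shape for a cost-aware combination is an early-exit cascade: run the cheapest agent first and consult the more expensive ones only when the verdict so far is not confident. The sketch below illustrates that idea under stated assumptions and is not CLASP's actual implementation. The names `Verdict`, `cascade`, and the three stub agents are hypothetical, the agents are plain functions standing in for LLM calls, and the confidence values and per-call costs are placeholders rather than measured figures.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    phishing: bool      # the agent's prediction
    confidence: float   # self-reported confidence in [0, 1] (assumed interface)
    cost_usd: float     # placeholder per-call cost, not a measured value


# An agent maps a URL to a Verdict. Real agents would call an LLM;
# here they are ordinary callables so the sketch runs offline.
Agent = Callable[[str], Verdict]


def cascade(url: str, agents: List[Tuple[str, Agent]], threshold: float = 0.9):
    """Run agents in the given order and stop at the first confident verdict.

    Returns (is_phishing, total_cost, names_of_agents_used). If no agent
    reaches the threshold, the last agent's verdict is returned.
    """
    if not agents:
        raise ValueError("at least one agent is required")
    total_cost = 0.0
    used = []
    verdict = None
    for name, agent in agents:
        verdict = agent(url)
        total_cost += verdict.cost_usd
        used.append(name)
        if verdict.confidence >= threshold:
            break  # confident enough: skip the remaining, costlier agents
    return verdict.phishing, total_cost, used


# Stub agents with hard-coded behavior, for illustration only.
def url_agent(url: str) -> Verdict:
    suspicious = "@" in url or url.count("-") > 3
    return Verdict(suspicious, 0.95 if suspicious else 0.6, 0.001)


def html_agent(url: str) -> Verdict:
    return Verdict(False, 0.7, 0.002)


def screenshot_agent(url: str) -> Verdict:
    return Verdict(False, 0.92, 0.003)


AGENTS = [("url", url_agent), ("html", html_agent), ("screenshot", screenshot_agent)]

if __name__ == "__main__":
    print(cascade("http://paypal.com@evil.example/login", AGENTS))
    print(cascade("https://example.org/", AGENTS))
```

In this sketch, a URL that the first agent flags with high confidence incurs only the first agent's cost, while an ambiguous URL pays for all three. Ordering the agents from cheapest to most expensive is what keeps the average per-website cost low when most inputs are resolved early.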