ScamFerret: Detecting Scam Websites Autonomously with Large Language Models

February 14, 2025


Hiroki Nakano, Takashi Koide, Daiki Chiba

From arXiv. Accepted for publication at DIMVA 2025.

With the rise of sophisticated scam websites that exploit human psychological vulnerabilities, distinguishing between legitimate and scam websites has become increasingly challenging. This paper presents ScamFerret, an innovative agent system employing a large language model (LLM) to autonomously collect and analyze data from a given URL to determine whether it is a scam. Unlike traditional machine learning models that require large datasets and feature engineering, ScamFerret leverages LLMs' natural language understanding to accurately identify scam websites of various types and languages without requiring additional training or fine-tuning. Our evaluation demonstrated that ScamFerret achieves 0.972 accuracy in classifying four scam types in English and 0.993 accuracy in classifying online shopping websites across three different languages, particularly when using GPT-4. Furthermore, we confirmed that ScamFerret collects and analyzes external information such as web content, DNS records, and user reviews as necessary, providing a basis for identifying scam websites from multiple perspectives. These results suggest that LLMs have significant potential in enhancing cybersecurity measures against sophisticated scam websites.
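The abstract describes an agent that decides which external information to collect for a URL and then issues a verdict. The sketch below illustrates that general pattern only; it is not the authors' implementation. Every name in it (`TOOLS`, `stub_llm`, `investigate`, `Verdict`) is hypothetical, the tools return canned strings instead of touching the network, and a keyword-counting stub stands in for the LLM so that the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical tools. Each returns canned text; a real system would
# fetch the page, query DNS, or search review sites here.
def fetch_web_content(url: str) -> str:
    return "Luxury watches 95% off! Pay by wire transfer only."

def lookup_dns(url: str) -> str:
    return "domain registered 3 days ago"

def search_user_reviews(url: str) -> str:
    return "multiple reviews report goods never delivered"

TOOLS: Dict[str, Callable[[str], str]] = {
    "web_content": fetch_web_content,
    "dns_records": lookup_dns,
    "user_reviews": search_user_reviews,
}

# Assumed red-flag phrases used only by the stub below.
RED_FLAGS = ("wire transfer", "registered 3 days ago", "never delivered")

Evidence = List[Tuple[str, str]]  # (tool name, tool output)

@dataclass
class Verdict:
    url: str
    is_scam: bool
    evidence: Evidence

def stub_llm(url: str, evidence: Evidence) -> str:
    """Stand-in for the LLM (an assumption, not a real model call).

    Returns either the name of the next tool to run, or
    'FINISH:scam' / 'FINISH:legitimate' once all tools have been used.
    """
    gathered = {name for name, _ in evidence}
    for name in TOOLS:
        if name not in gathered:
            return name
    hits = sum(any(flag in text for flag in RED_FLAGS) for _, text in evidence)
    return "FINISH:scam" if hits >= 2 else "FINISH:legitimate"

def investigate(url: str,
                decide: Callable[[str, Evidence], str] = stub_llm,
                max_steps: int = 10) -> Verdict:
    """Loop: ask the decider for an action, run the tool, repeat until a verdict."""
    evidence: Evidence = []
    for _ in range(max_steps):
        action = decide(url, evidence)
        if action.startswith("FINISH:"):
            return Verdict(url, action == "FINISH:scam", evidence)
        evidence.append((action, TOOLS[action](url)))
    # Step budget exhausted without a verdict; not flagging is an arbitrary default.
    return Verdict(url, False, evidence)

if __name__ == "__main__":
    result = investigate("http://shop.example")
    print(result.is_scam, [name for name, _ in result.evidence])
```

Swapping `stub_llm` for a function that prompts an actual model would keep the loop unchanged, which is the point of passing `decide` as a parameter. The collected evidence is returned with the verdict so that the basis for the decision can be inspected.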

