Cyber attacks continue to pose significant threats to individuals and organizations by stealing sensitive data such as personally identifiable information, financial information, and login credentials. Detecting malicious websites before they cause harm is therefore critical to preventing fraud and monetary loss. To address the increasing number of phishing attacks, protective mechanisms must be highly responsive, adaptive, and scalable. Advances in machine learning, coupled with access to vast amounts of data, have led to the adoption of various deep learning models for the timely detection of these cyber crimes. This study focuses on detecting phishing websites with four deep learning models, Multi-Head Attention, Temporal Convolutional Network (TCN), BI-LSTM, and LSTM, in which the URL of each website is treated as a sequence. The results demonstrate that the Multi-Head Attention and BI-LSTM models outperform the TCN and LSTM models, achieving higher precision, recall, and F1-scores.
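To illustrate what treating a URL as a sequence can look like in practice, the following is a minimal sketch of character-level encoding, a common preprocessing step for sequence models. The abstract does not state how the study tokenizes URLs, so the vocabulary (printable ASCII), the padding and unknown indices, the default `max_len` of 200, and the function name `encode_url` are all assumptions made for illustration.

```python
import string

# Assumed vocabulary: printable ASCII characters.
# Index 0 is reserved for padding and index 1 for unknown characters.
PAD, UNK = 0, 1
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}


def encode_url(url: str, max_len: int = 200) -> list[int]:
    """Map a URL to a fixed-length list of integer character ids.

    URLs longer than max_len are truncated; shorter ones are
    right-padded with PAD so every sequence has the same length.
    """
    ids = [VOCAB.get(ch, UNK) for ch in url[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical example URL, not taken from the study's dataset.
    sequence = encode_url("http://example.com/login", max_len=32)
    print(len(sequence), sequence[:8])
```

The resulting integer sequences could then be fed to an embedding layer ahead of any of the sequence models named above. Word-level or subword tokenization would be an equally plausible choice.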