An Efficient Machine Learning-based Framework for Detection and Prevention of Frauds in Telecom Networks

From arXiv. Peer-reviewed and presented at the 2025 International Conference on Advancement in Communication and Computing Technology (INOACC-2025); self-published by the author due to a sustained 13-month indexing delay by the organizers. 7 pages, 7 figures.

Telecommunication fraud is an acute problem that causes substantial material losses and compromises the reliability of telecom systems worldwide. Although approaches to fraud detection continue to shift, only effective and efficient detection mechanisms can counter these threats. This paper evaluates the performance of AI-driven models for fraud detection in telecommunication networks using the Telecom Call Detail Record (CDR) dataset, which contains 101,174 customer records with 17 attributes, including 8,830 fraud cases. In preprocessing, missing values were handled, features were scaled with Min-Max scaling, and the classes were balanced with the SMOTE technique. Random Forest (RF) and XGBoost models were then trained on the dataset for predictive analysis. Accuracy, precision, recall, F1-score, ROC AUC, and time were used as indicators to compare the performance of the two models. RF reached an accuracy of 99.9% and XGBoost 99.7%. The findings show that the suggested framework detects fraud with few misclassifications. Several machine learning models were evaluated and contrasted, such as RF, XGBoost, DBSCAN, RoBERTa, and K-means. Among all the models, RF gave the highest performance, with accuracy, precision, recall, and F1-score of 99.9% each, ahead of XGBoost, GNN, and BERT. The findings emphasize RF as the most effective model for detecting fraudulent activities in telecom networks, ensuring robust and reliable prevention of fraud.
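The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. With 8,830 fraud cases among 101,174 records, roughly 8.7% of the data is fraudulent, which is why class balancing matters. The code below is not the authors' implementation: the helper names (`min_max_scale`, `smote_like`, `binary_metrics`), the two-feature toy data, and the choice of `k=3` are all assumptions made for illustration, and the oversampler only reproduces the core SMOTE idea of interpolating between a minority sample and a nearby minority neighbour.

```python
import random

def min_max_scale(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the segment between a random
    minority sample and one of its k nearest minority neighbours.
    Needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels 0 (normal) / 1 (fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical toy records (two made-up features per record), 1 = fraud.
X = [[10.0, 200.0], [12.0, 180.0], [11.0, 210.0], [9.0, 190.0],
     [50.0, 5.0], [55.0, 8.0], [52.0, 6.0]]
y = [0, 0, 0, 0, 1, 1, 1]

X_scaled = min_max_scale(X)
fraud = [row for row, label in zip(X_scaled, y) if label == 1]
n_new = y.count(0) - y.count(1)  # synthetic samples needed to balance classes
X_bal = X_scaled + smote_like(fraud, n_new)
y_bal = y + [1] * n_new

print(len(X_bal), y_bal.count(0), y_bal.count(1))
print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

In practice the balanced data would be passed to a classifier such as Random Forest or XGBoost. A common precaution, not stated in the abstract, is to oversample only the training split so that synthetic points do not leak into the test set and inflate the reported scores.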

