Telecommunication fraud is an acute problem that causes substantial material losses and undermines the reliability of telecom systems worldwide. Countering these threats requires detection mechanisms that are both effective and efficient, and approaches to fraud detection continue to shift. This paper evaluates the performance of AI-driven models for fraud detection in telecommunication networks using the Telecom Call Detail Record (CDR) dataset, which contains 101,174 customer records with 17 attributes, including 8,830 fraud cases. During preprocessing, missing values were handled, features were scaled with Min-Max scaling, and the classes were balanced with the SMOTE technique. Random Forest (RF) and XGBoost models were then trained on the dataset for predictive analysis. The two models were compared on accuracy, precision, recall, F1-score, ROC AUC, and time. RF achieved an accuracy of 99.9% and XGBoost 99.7%. The findings show that the proposed framework detects fraud with few misclassifications. Several machine learning models were also evaluated and contrasted, such as RF, XGBoost, DBSCAN, RoBERTa, and K-means. Among all the models, RF gave the highest performance, with accuracy, precision, recall, and F1-score all at 99.9%, ahead of XGBoost, GNN, and BERT. The findings identify RF as the most effective model for detecting fraudulent activity in telecom networks, supporting robust and reliable fraud prevention.
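
The preprocessing steps named in the abstract (Min-Max scaling and SMOTE balancing) and the precision, recall, and F1-score metrics can be illustrated with a minimal, dependency-free sketch. Everything concrete here is an assumption for illustration: the function names, the two feature columns, and the toy rows are hypothetical and are not taken from the Telecom CDR dataset, and the abstract does not say which library implementations the study used.

```python
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points (needs >= 2 minority rows).

    Each new point lies on the segment between a random minority sample
    and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1-score for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up toy rows: [call_duration_s, calls_per_day]; label 1 = fraud.
X = [[30, 2], [45, 3], [60, 1], [50, 4], [600, 40], [720, 55]]
y = [0, 0, 0, 0, 1, 1]

X_scaled = min_max_scale(X)
fraud = [row for row, label in zip(X_scaled, y) if label == 1]
n_missing = y.count(0) - y.count(1)  # synthetic rows needed for balance

X_balanced = X_scaled + smote(fraud, n_missing, k=1)
y_balanced = y + [1] * n_missing
print(len(X_balanced), y_balanced.count(0), y_balanced.count(1))
```

The sketch scales and oversamples a single small table for brevity. A common design choice, assumed here rather than stated in the abstract, is to fit the scaler and apply SMOTE on the training split only, so that synthetic fraud cases never appear in the data used to compute accuracy, precision, recall, and F1-score.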