Credit assessment of mobile Internet users is an important basis on which communication operators make decisions and formulate measures, and it also helps guarantee that operators obtain their expected benefits. However, credit evaluation methods have long been monopolized by the financial sector, such as banks and credit institutions. Communication operators support and provide platform network technology and network resources, and they also build and maintain communication networks; the Internet data they hold can improve user credit evaluation strategies. This paper uses the massive data provided by communication operators to study an operator user credit evaluation model based on a fusion LightGBM algorithm. First, key features are extracted from the operators' massive user-evaluation data through data preprocessing and feature engineering, and a statistically significant multi-dimensional feature set is constructed. Next, multiple base models are built with machine learning algorithms such as linear regression, decision tree, and LightGBM, and the best base model is identified. Finally, ensemble methods such as Averaging, Voting, Blending, and Stacking are applied to build and refine multiple fusion models, from which the fusion model best suited to operator user evaluation is selected.
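The fusion step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes synthetic single-feature data, uses a one-feature least-squares fit and a depth-1 regression tree as stand-ins for the base models (LightGBM is omitted to keep the example dependency-free), and shows only Averaging and Blending. All function names, variable names, and data values are hypothetical.

```python
import random
from statistics import mean


def fit_linear(xs, ys):
    """Ordinary least squares on one feature; returns a predictor function."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_stump(xs, ys):
    """Depth-1 regression tree: choose the split with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[1:]:  # skipping the minimum keeps both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x < t else mr


def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


def blend_weight(a, b, ys):
    """Weight w in [0, 1] minimising the squared error of w*a + (1-w)*b."""
    denom = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    if denom == 0:  # identical base predictions: any weight gives the same result
        return 0.5
    w = sum((ai - bi) * (yi - bi) for ai, bi, yi in zip(a, b, ys)) / denom
    return min(1.0, max(0.0, w))


# Synthetic stand-in for operator data: one feature x, one credit score y (assumed).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(300)]
ys = [500 + 12 * x + (40 if x > 6 else 0) + rng.gauss(0, 5) for x in xs]
x_train, y_train = xs[:150], ys[:150]
x_hold, y_hold = xs[150:225], ys[150:225]  # holdout set used only for blending
x_test, y_test = xs[225:], ys[225:]

# Base models are fitted on the training split only.
linear = fit_linear(x_train, y_train)
stump = fit_stump(x_train, y_train)


def averaging(x):
    """Averaging: unweighted mean of the base model predictions."""
    return (linear(x) + stump(x)) / 2


# Blending: learn the combination weight from base predictions on the holdout split.
w = blend_weight([linear(x) for x in x_hold], [stump(x) for x in x_hold], y_hold)


def blending(x):
    """Blending: holdout-weighted combination of the base model predictions."""
    return w * linear(x) + (1 - w) * stump(x)


models = {"linear": linear, "stump": stump, "averaging": averaging, "blending": blending}
for name, model in models.items():
    print(name, round(mse([model(x) for x in x_test], y_test), 2))
```

Averaging needs no extra data, while Blending holds out part of the training data so that the combination weight is not fitted on the same samples as the base models. Stacking generalizes this by training a meta-model on out-of-fold base predictions; Voting applies to classification outputs and is not shown here.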