With growing competition in the banking industry, banks must pursue customer retention strategies while also trying to increase their market share by acquiring new customers. This study compares the performance of six supervised classification techniques in order to suggest an efficient model for predicting customer churn in the banking industry, given 10 demographic and personal attributes from 10,000 customers of European banks. The effects of feature selection, class imbalance, and outliers are discussed for the artificial neural network (ANN) and random forest, the two competing models. The results show that, unlike random forest, the ANN raises no serious overfitting concerns and is also robust to noise. Therefore, an ANN with five nodes in a single hidden layer is identified as the best-performing classifier.
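To make the selected architecture concrete, the following is a minimal sketch of a network with 10 inputs and five nodes in a single hidden layer, together with the kind of train/test comparison typically used to check for overfitting. It is an illustration under stated assumptions, not the study's implementation: the class `TinyANN`, the helper `make_synthetic`, the sigmoid activations, the plain stochastic gradient descent training, and the synthetic data standing in for the bank records are all hypothetical choices made here.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class TinyANN:
    """Hypothetical single-hidden-layer network with a sigmoid output."""

    def __init__(self, n_inputs=10, n_hidden=5, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def n_params(self):
        # Hidden weights + hidden biases + output weights + output bias.
        return sum(len(r) for r in self.W1) + len(self.b1) + len(self.W2) + 1

    def _hidden(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def predict_proba(self, x):
        h = self._hidden(x)
        return sigmoid(sum(w * hj for w, hj in zip(self.W2, h)) + self.b2)

    def fit(self, X, y, epochs=30, lr=0.1):
        # Plain SGD on the log loss (an assumption, not the study's optimizer).
        for _ in range(epochs):
            for x, t in zip(X, y):
                h = self._hidden(x)
                p = sigmoid(sum(w * hj for w, hj in zip(self.W2, h)) + self.b2)
                d_out = p - t  # gradient w.r.t. the output pre-activation
                for j, hj in enumerate(h):
                    # Backpropagate through the old output weight first.
                    d_hid = d_out * self.W2[j] * hj * (1.0 - hj)
                    self.W2[j] -= lr * d_out * hj
                    self.b1[j] -= lr * d_hid
                    for i, xi in enumerate(x):
                        self.W1[j][i] -= lr * d_hid * xi
                self.b2 -= lr * d_out
        return self


def accuracy(model, X, y):
    hits = sum((model.predict_proba(x) >= 0.5) == (t == 1)
               for x, t in zip(X, y))
    return hits / len(y)


def make_synthetic(n, n_features=10, seed=1):
    # Synthetic stand-in for customer records; the minority class plays "churn".
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [1 if x[0] + 0.5 * x[1] + rng.gauss(0, 0.5) > 1.0 else 0 for x in X]
    return X, y


X, y = make_synthetic(400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

model = TinyANN(n_inputs=10, n_hidden=5).fit(X_train, y_train)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
print(f"train={train_acc:.3f} test={test_acc:.3f} gap={train_acc - test_acc:+.3f}")
```

A small gap between training and test accuracy is the usual sign that a model is not overfitting. With 10 inputs and five hidden nodes, this sketch has only 61 trainable parameters, which is one plausible reason a network of this size would generalize well.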