Advances in computing have made data-driven credit scoring practical, and machine learning (ML) and deep learning (DL) techniques are becoming increasingly valuable for it. Complex models tend to yield more accurate predictions, but they are often harder to interpret, which is a concern in credit scoring, where fair decisions matter. Because the features of the dataset are a crucial factor in a credit scoring system, we apply Linear Discriminant Analysis (LDA) as a feature reduction technique to lower model complexity. We compared six machine learning models, one deep learning model, and one hybrid model, each with and without LDA. Our hybrid model, XG-DNN, outperformed the other models, achieving the highest accuracy (99.45%) and an F1 score of 99% with LDA. Finally, to interpret model decisions, we applied two explainable AI techniques: LIME for local explanations and Morris Sensitivity Analysis for global explanations. This research shows that feature reduction can be used without degrading the performance or explainability of the model, which is useful for reducing the computational workload in resource-constrained settings.
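As a rough illustration of the global technique named above, the following is a minimal sketch of Morris elementary-effects screening in plain Python. It is not the implementation used in this work: `toy_score`, `morris_mu_star`, and all parameter values are hypothetical, inputs are assumed to be scaled to [0, 1], and only increasing steps are taken, which simplifies the usual Morris design.

```python
import random


def morris_mu_star(model, n_features, n_trajectories=50, levels=4, seed=0):
    """Estimate Morris mu* (mean absolute elementary effect) per feature.

    Assumes every input feature is scaled to [0, 1].
    """
    rng = random.Random(seed)
    delta = levels / (2 * (levels - 1))  # commonly used Morris step size
    # Grid values from which a step of +delta stays inside [0, 1].
    grid = [
        i / (levels - 1)
        for i in range(levels)
        if i / (levels - 1) + delta <= 1 + 1e-12
    ]
    abs_effects = [[] for _ in range(n_features)]

    for _ in range(n_trajectories):
        # Random starting point of one trajectory.
        x = [rng.choice(grid) for _ in range(n_features)]
        y = model(x)
        # Perturb the features one at a time, in random order.
        order = list(range(n_features))
        rng.shuffle(order)
        for i in order:
            x_next = list(x)
            x_next[i] = x[i] + delta
            y_next = model(x_next)
            abs_effects[i].append(abs(y_next - y) / delta)
            x, y = x_next, y_next  # continue from the perturbed point

    return [sum(effects) / len(effects) for effects in abs_effects]


def toy_score(x):
    """Hypothetical stand-in for a trained credit scoring model."""
    return 3.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]


mu_star = morris_mu_star(toy_score, n_features=3)
ranking = sorted(range(3), key=lambda i: mu_star[i], reverse=True)
print(mu_star, ranking)
```

Because `toy_score` is linear, each elementary effect equals the corresponding coefficient up to floating-point rounding, so the ranking simply orders features by coefficient size. For a real model, `model` would wrap the trained predictor's scoring function.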