Developing interpretable machine learning models has become increasingly important. One way data scientists develop interpretable models is through dimension reduction. In this paper, we examine several dimension reduction techniques, including two recent approaches from the network psychometrics literature: exploratory graph analysis (EGA) and unique variable analysis (UVA). We compare EGA and UVA with two dimension reduction techniques common in the machine learning literature (principal component analysis and independent component analysis), as well as with no reduction of the variables, using real data. We show that EGA and UVA perform as well as the other reduction techniques or no reduction. Consistent with previous literature, we show that dimension reduction can decrease, increase, or leave unchanged the accuracy obtained with no reduction of variables. Our tentative results suggest that dimension reduction tends to lead to better performance when used for classification tasks.