Shapley values are used extensively in machine learning, not only to explain black-box models but also for model debugging, sensitivity and fairness analyses, and the selection of important features for robust modelling and follow-up analyses. Shapley values satisfy axioms that promote a fair distribution of feature contributions toward a prediction or toward a reduction in error, accounting for the non-linear relationships and interactions that arise when complex machine learning models are employed. Recently, a number of feature selection methods utilising Shapley values have been introduced. Here, we present a novel feature selection method, LLpowershap, which uses loss-based Shapley values to identify informative features while keeping noise features in the selected feature set to a minimum. Our simulation results show that LLpowershap not only identifies a higher number of informative features but also outputs fewer noise features than other state-of-the-art feature selection methods. Benchmarking results on four real-world datasets demonstrate that LLpowershap achieves higher or on-par predictive performance compared with other Shapley-based wrapper methods and with filter methods.
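As a minimal illustration of the axioms referred to above, the following sketch computes exact Shapley values for a toy game in which the value of a feature subset is the reduction in loss it yields. It is not the LLpowershap procedure. The function names `shapley_values` and `toy_loss_reduction`, and the numbers inside the toy value function, are assumptions made purely for illustration: features 0 and 1 are informative and interact, while feature 2 is noise.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a set function `value` over feature indices.

    Enumerates every subset, so it is only practical for a handful of features.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets of this size
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_loss_reduction(subset):
    """Hypothetical value function: loss reduction achieved by a feature subset.

    The numbers are invented for illustration only.
    """
    gain = 0.0
    if 0 in subset:
        gain += 4.0  # informative feature
    if 1 in subset:
        gain += 2.0  # informative feature
    if 0 in subset and 1 in subset:
        gain += 1.0  # interaction between features 0 and 1
    return gain  # feature 2 (noise) never changes the loss


phi = shapley_values(3, toy_loss_reduction)
total = toy_loss_reduction(frozenset({0, 1, 2})) - toy_loss_reduction(frozenset())
```

In this toy game the attributions sum to the total loss reduction (efficiency), the interaction gain is split equally between features 0 and 1, and the noise feature receives an attribution of zero (null player). In practice, Shapley values for real models are estimated by approximation, since exact enumeration grows exponentially with the number of features.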