Global variable importance measures are commonly used to interpret the results of machine learning models, whereas local variable importance techniques assess how variables contribute to individual observations. Currently popular methods, including LIME and SHAP, provide useful measures of feature contribution in the prediction space but leave room for better characterization of local structure in the model loss space. Additionally, they are not natively adapted for multi-class classification problems. We propose CLIQUE, a new model-agnostic method for calculating local variable importance that highlights locally dependent relationships, provides improved stability over permutation-based methods, and can be directly applied to multi-class classification problems. Simulated and real-world examples show that CLIQUE emphasizes locally dependent information, captures interaction behavior beyond what can be evaluated by correlations, and assigns zero importance in regions where the response is invariant to changes in variables.
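To make the idea of local importance in the loss space concrete, the following is a minimal sketch of a generic permutation-style baseline. It is not the CLIQUE algorithm, which is not specified in this abstract. The function `local_loss_importance`, the `toy_model`, and the simulated data are all hypothetical and chosen only for illustration. The sketch measures, for each observation, the average increase in squared-error loss when one feature is replaced by values taken from other rows.

```python
import numpy as np

def local_loss_importance(predict, X, y, feature, n_repeats=20, seed=0):
    """Per-observation mean increase in squared-error loss when one feature
    is replaced by values permuted from other rows (illustrative baseline)."""
    rng = np.random.default_rng(seed)
    base_loss = (y - predict(X)) ** 2
    increase = np.zeros(len(X))
    for _ in range(n_repeats):
        X_mod = X.copy()
        # Shuffle only the chosen column; all other features stay fixed.
        X_mod[:, feature] = rng.permutation(X[:, feature])
        increase += (y - predict(X_mod)) ** 2 - base_loss
    return increase / n_repeats

def toy_model(X):
    # Hypothetical fitted model: x0 affects the prediction only when x1 > 0.
    return np.where(X[:, 1] > 0, X[:, 0], 0.0)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = toy_model(X)  # noise-free response, so the baseline loss is zero

imp_x0 = local_loss_importance(toy_model, X, y, feature=0)
inactive = X[:, 1] <= 0  # region where the response ignores x0

print("max importance of x0 where x1 <= 0:", imp_x0[inactive].max())
print("mean importance of x0 where x1 > 0:", imp_x0[~inactive].mean())
```

In this toy setting the importance of `x0` is exactly zero wherever `x1 <= 0`, because the prediction there does not change when `x0` is replaced, and it is positive on average elsewhere. This shows the kind of locally dependent behavior the abstract describes, under the stated assumptions only.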