Leveraging the large body of work devoted in recent years to describing redundancy and synergy in multivariate interactions among random variables, we propose a novel approach to quantify cooperative effects in feature importance, one of the most widely used techniques for explainable artificial intelligence. In particular, we propose an adaptive version of a well-known feature importance metric, Leave One Covariate Out (LOCO), to disentangle high-order effects involving a given input feature in regression problems. LOCO is the reduction in prediction error when the feature under consideration is added to the set of all the other features used for regression. Instead of computing LOCO over all the available features, as in its standard version, our method searches for the multiplet of features that maximizes LOCO and for the one that minimizes it. This yields a decomposition of LOCO as the sum of a two-body component and higher-order components (redundant and synergistic), while also highlighting the features that contribute to building these high-order effects alongside the driving feature. We report an application to proton/pion discrimination from detector measurements simulated with GEANT.
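The standard LOCO metric described above can be sketched as follows; this is a minimal illustration assuming a scikit-learn-style regressor and synthetic data, with all names and the choice of linear regression being illustrative rather than part of the proposed method.

```python
# Minimal sketch of the standard LOCO metric: the reduction in
# prediction error when a feature is added to a conditioning set.
# Data, model, and feature indices here are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
# Feature 0 drives y linearly; features 1 and 2 interact.
y = 2.0 * X[:, 0] + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def prediction_error(cols):
    """Out-of-sample MSE of a regression restricted to columns `cols`."""
    model = LinearRegression().fit(X_tr[:, cols], y_tr)
    return mean_squared_error(y_te, model.predict(X_te[:, cols]))

def loco(j, cols):
    """Error reduction when feature j is added to the feature set `cols`."""
    return prediction_error(cols) - prediction_error(cols + [j])

# Standard LOCO for feature 0: condition on all remaining features.
print(loco(0, [1, 2, 3]))
```

The adaptive version proposed in the paper would replace the fixed conditioning set `[1, 2, 3]` with a search over multiplets of features that maximize or minimize this quantity.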