Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data. Additionally, even when accounting for all possible explanations for a given dataset, these insights may not generalize because not all good explanations are stable across reasonable data perturbations. We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution. Our framework is extremely flexible and can be integrated with most existing model classes and global variable importance metrics. We demonstrate through experiments that our framework recovers variable importance rankings for complex simulation setups where other methods fail. Further, we show that our framework accurately estimates the true importance of a variable for the underlying data distribution. We provide theoretical guarantees on the consistency and finite sample error rates for our estimator. Finally, we demonstrate its utility with a real-world case study exploring which genes are important for predicting HIV load in persons with HIV, highlighting an important gene that has not previously been studied in connection with HIV. Code is available at https://github.com/jdonnelly36/Rashomon_Importance_Distribution.
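To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation or the API of the linked repository. It resamples the data with the bootstrap, approximates the set of good models as those whose loss is within a tolerance of the best loss on each resample, and records a variable's importance averaged over that set. Everything named here is an assumption chosen for illustration: the function `importance_distribution`, the parameters `epsilon` and `n_boot`, the model class (least-squares fits on every feature subset), the importance metric (permutation increase in mean squared error), and the choice to average within each good-model set.

```python
import itertools
import numpy as np


def design(X, features):
    """Design matrix with an intercept and the selected feature columns."""
    return np.column_stack([np.ones(len(X))] + [X[:, j] for j in features])


def fit_linear(X, y, features):
    """Least-squares coefficients using only the given feature columns."""
    coef, *_ = np.linalg.lstsq(design(X, features), y, rcond=None)
    return coef


def mse(X, y, features, coef):
    """Mean squared error of a fitted subset model."""
    return float(np.mean((y - design(X, features) @ coef) ** 2))


def permutation_importance(X, y, features, coef, j, rng):
    """Increase in MSE after shuffling column j (stand-in importance metric)."""
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    return mse(X_perm, y, features, coef) - mse(X, y, features, coef)


def importance_distribution(X, y, epsilon=0.05, n_boot=20, seed=0):
    """For each variable, one importance value per bootstrap resample,
    averaged over all models within `epsilon` of the best loss."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Hypothetical model class: one linear model per non-empty feature subset.
    subsets = [s for r in range(1, p + 1)
               for s in itertools.combinations(range(p), r)]
    samples = {j: [] for j in range(p)}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the rows
        Xb, yb = X[idx], y[idx]
        fits = [(s, fit_linear(Xb, yb, s)) for s in subsets]
        losses = [mse(Xb, yb, s, c) for s, c in fits]
        threshold = min(losses) + epsilon
        # Approximate set of good models for this resample.
        good = [f for f, loss in zip(fits, losses) if loss <= threshold]
        for j in range(p):
            vals = [permutation_importance(Xb, yb, s, c, j, rng)
                    for s, c in good]
            samples[j].append(float(np.mean(vals)))
    return {j: np.array(v) for j, v in samples.items()}


# Synthetic data: x0 matters most, x1 matters less, x2 is pure noise.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * data_rng.normal(size=200)

dist = importance_distribution(X, y)
for j, values in dist.items():
    print(f"x{j}: mean importance {values.mean():.3f}, std {values.std():.3f}")
```

Each entry of `dist` is an array with one value per bootstrap resample, so its spread gives a rough sense of how stable a variable's importance is under data perturbation. Enumerating every feature subset is only feasible for a handful of variables; it is used here purely to keep the sketch short.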