Feature attributions are a commonly used explanation type when we want to explain, post hoc, the prediction of a trained model. Yet they are not well explored in IR. Importantly, feature attribution has rarely been rigorously defined beyond assigning the highest value to the most important feature; what it means for one feature to be more important than another is often left vague. Consequently, most approaches focus on merely selecting the most important features and underutilize, or even ignore, the relative importance among features. In this work, we rigorously define the notion of feature attribution for ranking models and list essential properties that a valid attribution should have. We then propose RankingSHAP as a concrete instantiation of a list-wise ranking attribution method. In contrast to current explanation evaluation schemes that focus on selections, we propose two novel paradigms for evaluating attributions over learning-to-rank models. We evaluate RankingSHAP on commonly used learning-to-rank datasets to showcase the more nuanced use of an attribution method while highlighting the limitations of selection-based explanations. In a simulated experiment, we design an interpretable model to demonstrate how list-wise ranking attributions can be used to investigate model decisions and to evaluate the explanations qualitatively. Because of the contrastive nature of the ranking task, our understanding of ranking model decisions can substantially benefit from feature attribution explanations like RankingSHAP.