Most work on interpreting neural relevance models has focused on local explanations, which explain why a particular document is relevant to a particular query but cannot predict the model's behavior on unseen query-document pairs. We propose a novel method to globally explain neural relevance models by constructing a "relevance thesaurus" containing semantically relevant query and document term pairs. This thesaurus is used to augment lexical matching models such as BM25 to approximate the neural model's predictions. Our method involves training a neural relevance model to score the relevance of partial query and document segments, which is then used to identify relevant term pairs across the vocabulary. We evaluate the resulting thesaurus explanation on two criteria: ranking effectiveness and fidelity to the target neural ranking model. Notably, the thesaurus reveals brand name bias in ranking models, demonstrating one advantage of our explanation method.
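To make the idea of thesaurus-augmented lexical matching concrete, here is a minimal sketch: a toy BM25 scorer whose term-frequency component also counts thesaurus-weighted soft matches between query terms and document terms. The corpus, the thesaurus entries, and the weighting scheme are illustrative assumptions, not the paper's actual data or implementation; the brand-name pair ("advil", "ibuprofen") merely illustrates the kind of entry such a thesaurus could contain.

```python
import math
from collections import Counter

# Toy corpus (illustrative, not from the paper).
docs = [
    "ibuprofen relieves headache pain".split(),
    "acetaminophen tablets for fever".split(),
]

# Hypothetical relevance thesaurus: (query term, document term) -> weight.
# A brand name mapping to a generic drug name is one plausible entry.
thesaurus = {("advil", "ibuprofen"): 0.9}

def bm25_idf(term, docs):
    """Standard BM25 inverse document frequency."""
    n = sum(1 for d in docs if term in d)
    return math.log((len(docs) - n + 0.5) / (n + 0.5) + 1)

def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """BM25 where exact term frequency is augmented with
    thesaurus-weighted occurrences of related document terms."""
    avgdl = sum(len(d) for d in docs) / len(docs)
    tf = Counter(doc)
    score = 0.0
    for q in query:
        # Exact matches plus weighted soft matches from the thesaurus.
        f = tf[q] + sum(w * tf[t] for (qt, t), w in thesaurus.items() if qt == q)
        if f == 0:
            continue
        score += bm25_idf(q, docs) * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc) / avgdl)
        )
    return score

query = "advil headache".split()
s0 = bm25_score(query, docs[0], docs)  # matches via thesaurus + exact term
s1 = bm25_score(query, docs[1], docs)  # no exact or thesaurus match
```

Plain BM25 would give the first document no credit for "advil"; the thesaurus entry lets it receive partial credit through "ibuprofen", which is the kind of behavior used to approximate the neural model's predictions.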