Modern deterministic retrieval pipelines are optimized for state-of-the-art performance, but their decisions are often hard to interpret. They also struggle to assess their own uncertainty, which leads to overconfident predictions. To address both limitations, we integrate uncertainty calibration and interpretability into a retrieval pipeline. Specifically, we use Bayesian methods and multi-perspective retrieval to calibrate uncertainty, and we apply explanation techniques such as LIME and SHAP to analyze the behavior of a black-box reranker model. The importance scores produced by these explanation methods then serve as supplementary relevance scores that enhance the base reranker. We evaluate the gains from uncertainty calibration and interpretable reranking on Question Answering and Fact Checking tasks, and our methods yield substantial performance improvements across three KILT datasets.
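As a rough illustration of how explanation-derived importance scores might supplement a base reranker score, consider the following minimal sketch. It is not the method evaluated in this work: the reranker is replaced by a toy term-overlap scorer (`overlap_score`), LIME and SHAP are replaced by a much simpler leave-one-out occlusion estimate (`occlusion_importance`), and the aggregation rule and the mixing weight `alpha` are hypothetical choices made only for the example.

```python
from typing import Callable, List, Tuple

ScoreFn = Callable[[str, str], float]


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a black-box reranker (assumption, not a real model):
    the fraction of distinct query terms that appear in the passage."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & p_terms) / len(q_terms)


def occlusion_importance(query: str, passage: str, score_fn: ScoreFn) -> List[float]:
    """Leave-one-out importance per passage token: how much the score drops
    when that token is removed. A simple substitute for LIME/SHAP attributions."""
    tokens = passage.split()
    base = score_fn(query, passage)
    drops = []
    for i in range(len(tokens)):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        drops.append(base - score_fn(query, reduced))
    return drops


def explanation_score(query: str, passage: str, score_fn: ScoreFn) -> float:
    """Hypothetical aggregation: sum of the positive token importances."""
    return sum(d for d in occlusion_importance(query, passage, score_fn) if d > 0)


def rerank(query: str, passages: List[str], score_fn: ScoreFn,
           alpha: float = 0.5) -> List[Tuple[str, float]]:
    """Mix the base score with the explanation-derived score (weight alpha)
    and return (passage, combined score) pairs, best first."""
    scored = []
    for p in passages:
        base = score_fn(query, p)
        extra = explanation_score(query, p, score_fn)
        scored.append((p, (1 - alpha) * base + alpha * extra))
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    candidates = [
        "bananas are yellow",
        "france france france",
        "paris is the capital of france",
    ]
    for passage, score in rerank("capital of france", candidates, overlap_score):
        print(f"{score:.3f}  {passage}")
```

In this toy setting, a passage that merely repeats one query term receives no explanation bonus, because removing any single copy of the term leaves the base score unchanged. A passage in which each matching token is individually necessary is boosted. A real pipeline would substitute a trained reranker and attributions from LIME or SHAP for the two stand-ins.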