Retrieval-Augmented Language Models (RALMs) boost task performance, owing to the retriever that provides external knowledge. Although crucial, the retriever primarily focuses on semantic relevance, which may not always be effective for generation. Thus, utility-based retrieval has emerged as a promising topic, prioritizing passages that provide valid benefits for downstream tasks. However, accurately capturing passage utility remains underexplored, owing to a limited understanding of what makes a passage useful. This work proposes SCARLet, a framework for training utility-based retrievers in RALMs that incorporates two key factors: multi-task generalization and inter-passage interaction. First, SCARLet constructs shared contexts on which training data for various tasks is synthesized. This mitigates semantic bias arising from context differences, allowing retrievers to focus on learning task-specific utility and to generalize across tasks. Second, SCARLet uses a perturbation-based attribution method to estimate passage-level utility within the shared context, which reflects interactions between passages and provides more accurate feedback. We evaluate our approach on ten datasets spanning various tasks, both in-domain and out-of-domain, and show that retrievers trained with SCARLet consistently improve the overall performance of RALMs.
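To make the idea of perturbation-based utility attribution concrete, here is a minimal sketch of one common instantiation: leave-one-out perturbation, where each passage's utility is estimated as the drop in downstream task score when that passage is removed from the shared context. The `score_fn` callable and `passage_utility` helper are hypothetical names for illustration, not the paper's actual interface, and the real SCARLet method may use a different perturbation scheme.

```python
from typing import Callable, List

def passage_utility(
    passages: List[str],
    score_fn: Callable[[List[str]], float],
) -> List[float]:
    """Estimate each passage's utility as the score drop when it is
    removed from the shared context (leave-one-out perturbation).

    Because the score is computed on the full passage set, the estimate
    reflects inter-passage interactions rather than scoring each
    passage in isolation.
    """
    base = score_fn(passages)  # downstream score with the full context
    utilities = []
    for i in range(len(passages)):
        perturbed = passages[:i] + passages[i + 1:]  # drop passage i
        utilities.append(base - score_fn(perturbed))
    return utilities

if __name__ == "__main__":
    # Toy scorer (an assumption for demonstration): rewards any context
    # that contains the gold answer string.
    gold = "Paris"
    toy_score = lambda ctx: float(any(gold in p for p in ctx))
    ctx = ["Paris is the capital of France.", "Bananas are yellow."]
    print(passage_utility(ctx, toy_score))  # → [1.0, 0.0]
```

In practice `score_fn` would query the RALM's generator and compare its output against a reference, so passages whose removal degrades generation quality receive high utility and become positive training signals for the retriever.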