Information retrieval systems have traditionally optimized for topical relevance, the degree to which retrieved documents match a query. However, relevance only approximates a deeper goal: utility, that is, whether retrieved information helps accomplish a user's underlying task. The emergence of retrieval-augmented generation (RAG) fundamentally changes this paradigm. Retrieved documents are no longer consumed directly by users; instead, they serve as evidence for large language models (LLMs) that produce answers. As a result, retrieval effectiveness must be evaluated by its contribution to generation quality rather than by relevance-based ranking metrics alone. This tutorial argues that retrieval objectives are evolving from relevance-centric optimization toward LLM-centric utility. We present a unified framework covering LLM-agnostic versus LLM-specific utility, context-independent versus context-dependent utility, and the connections between utility, LLM information needs, and agentic RAG. By synthesizing recent advances, the tutorial provides conceptual foundations and practical guidance for designing retrieval systems aligned with the requirements of LLM-based information access.