Instruction-following capabilities in LLMs have progressed significantly, enabling more complex user interactions through detailed prompts. However, retrieval systems have not matched these advances; most still rely on traditional lexical and semantic matching techniques that fail to fully capture user intent. Recent efforts have introduced instruction-aware retrieval models, but these primarily focus on intrinsic content relevance and neglect customized preferences over broader document-level attributes. This study evaluates the instruction-following capabilities of various retrieval models beyond content relevance, including LLM-based dense retrieval and reranking models. We develop InfoSearch, a novel retrieval evaluation benchmark spanning six document-level attributes: Audience, Keyword, Format, Language, Length, and Source, and introduce two novel metrics -- the Strict Instruction Compliance Ratio (SICR) and the Weighted Instruction Sensitivity Evaluation (WISE) -- to accurately assess the models' responsiveness to instructions. Our findings indicate that although fine-tuning models on instruction-aware retrieval datasets and increasing model size improve performance, most models still fall short of full instruction compliance.
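To make the notion of document-level instruction compliance concrete, the sketch below shows a minimal, hypothetical way to check whether a ranked list respects an attribute-style instruction (e.g., a Length constraint). It is not the paper's SICR or WISE definition; the function name, data layout, and the top-k all-or-nothing criterion are illustrative assumptions only.

```python
# Toy illustration (NOT the paper's SICR/WISE metrics): given a ranked list
# per query and a predicate encoding a document-level instruction (e.g.,
# "length under 200 words"), compute the fraction of queries whose top-k
# retrieved documents all satisfy the instruction.

from typing import Callable, Dict, List


def naive_compliance_ratio(
    rankings: Dict[str, List[str]],     # query id -> ranked list of doc ids
    docs: Dict[str, dict],              # doc id -> document fields
    satisfies: Callable[[dict], bool],  # instruction predicate on a document
    k: int = 5,
) -> float:
    """Fraction of queries whose top-k documents all satisfy the instruction."""
    compliant = 0
    for qid, ranked in rankings.items():
        top_k = [docs[doc_id] for doc_id in ranked[:k]]
        if top_k and all(satisfies(doc) for doc in top_k):
            compliant += 1
    return compliant / max(len(rankings), 1)


if __name__ == "__main__":
    # Hypothetical corpus annotated with word counts for a "Length" instruction.
    docs = {
        "d1": {"text": "short answer", "n_words": 120},
        "d2": {"text": "long survey", "n_words": 4500},
    }
    rankings = {"q1": ["d1", "d2"], "q2": ["d2", "d1"]}
    ratio = naive_compliance_ratio(rankings, docs, lambda d: d["n_words"] < 200, k=1)
    print(f"naive compliance ratio: {ratio:.2f}")  # prints 0.50
```

In contrast to this naive ratio, the paper's SICR and WISE metrics are designed to measure how a model's ranking changes in response to instructions, rather than only whether the final top results happen to satisfy them.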