Retrieval-augmented generation (RAG) ranks passages by semantic similarity to the input, implicitly assuming that similarity is a reliable indicator of applicability to the downstream task. This assumption breaks down when task success depends not on topical relevance but on applying the correct rules, constraints, or procedural guidance. In such settings, the most useful context may be the rule that the input triggers rather than the passage most semantically similar to it. We propose Task-Aligned Retrieval (TAG), a framework that replaces similarity-based retrieval with applicability-based rule selection. TAG transforms source documents into traceable condition-action rules, identifies which rules apply to a given input through pairwise LLM judgments, and generates the output conditioned only on the actions of the selected rules. Across Wikipedia NPOV rewriting, HumanEval with PEP~8 compliance, and NBA transaction reasoning on RuleArena, TAG consistently outperforms standard RAG, with the largest gains in high-mismatch settings (up to 12.2\%) while reducing retrieved context by up to 93\%. These results suggest that, in rule- and instruction-governed tasks, retrieval should optimize for applicability rather than for semantic similarity alone.
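The three stages described above (rule extraction, applicability judgment, conditioned generation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Rule` fields, the example rules, the `source` pointers, and the prompt wording are all hypothetical, "pairwise" is read here as one judgment per (input, rule) pair, and the LLM judge is replaced by a toy keyword check so that the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    """A condition-action rule with a pointer back to its source (hypothetical schema)."""
    rule_id: str
    condition: str  # when the rule applies
    action: str     # what to do if it applies
    source: str     # where the rule came from, kept for traceability


# A judge answers one question per (input, rule) pair: does this rule apply?
# In the setting described above this would be an LLM call; any callable
# with this signature can be plugged in.
Judge = Callable[[str, Rule], bool]

# Hypothetical rules, as if already extracted from a neutrality guideline.
RULES = [
    Rule("R1", "the text uses promotional wording",
         "Replace promotional adjectives with neutral descriptions.",
         "guideline, section A (hypothetical)"),
    Rule("R2", "the text states a subjective ranking as fact",
         "Attribute the judgment to a named source or remove it.",
         "guideline, section B (hypothetical)"),
    Rule("R3", "the text contains a direct quotation",
         "Keep the quotation verbatim and attribute it.",
         "guideline, section C (hypothetical)"),
]

# Toy stand-in for the LLM judge: surface triggers per rule (assumption,
# used only so the example is deterministic and needs no model access).
TRIGGERS = {
    "R1": ("legendary", "world-class"),
    "R2": ("best", "greatest"),
    "R3": ('"',),
}


def toy_judge(task_input: str, rule: Rule) -> bool:
    text = task_input.lower()
    return any(trigger in text for trigger in TRIGGERS.get(rule.rule_id, ()))


def select_applicable(task_input: str, rules: List[Rule], judge: Judge) -> List[Rule]:
    """Keep every rule the judge marks as applicable; no similarity ranking is involved."""
    return [rule for rule in rules if judge(task_input, rule)]


def build_prompt(task_input: str, selected: List[Rule]) -> str:
    """Condition generation only on the actions of the selected rules."""
    actions = "\n".join(f"- {rule.action} [{rule.rule_id}]" for rule in selected)
    return f"Follow these instructions:\n{actions}\n\nInput:\n{task_input}"


task_input = "The legendary band released its best album in 1999."
selected = select_applicable(task_input, RULES, toy_judge)
prompt = build_prompt(task_input, selected)
print(prompt)
```

Only the actions of the selected rules enter the prompt, which is where a reduction in retrieved context would come from in this sketch; the rule identifiers are carried along so that each instruction remains traceable to its source.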