Generative retrieval (GR) is an emerging paradigm that leverages large language models (LLMs) to autoregressively generate document identifiers (docids) relevant to a given query. Prior work has focused on exploiting the generative capabilities of LLMs to improve GR, overlooking that their reasoning capabilities could help as well. This raises a key question: can explicit reasoning benefit GR? To investigate, we first conduct a preliminary study in which an LLM is prompted to generate free-form chain-of-thought (CoT) reasoning before performing constrained docid decoding. Although this method outperforms standard GR, the generated reasoning tends to be verbose and poorly aligned with the docid space. These limitations motivate a reasoning mechanism better tailored to GR. We therefore propose Reason-for-Retrieval (R4R), a reasoning-augmented framework for GR that converts free-form CoT reasoning into a compact, structured format and iteratively refines it during retrieval. R4R augments an existing GR method with a reasoning-capable LLM that has been instruction-tuned for GR. At inference time, R4R first uses the LLM to generate an initial structured reasoning; the same LLM then alternates between (i) constrained decoding with the chosen GR method to produce candidate docids and (ii) updating the reasoning based on the retrieval results to improve the next round. R4R requires no additional models or training: a single LLM serves as both the reasoning generator and the retriever. Extensive experiments on Natural Questions, MS MARCO, and a real-world item-search benchmark validate the effectiveness of R4R.
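The iterative inference procedure described above can be sketched as the following minimal Python loop. This is an illustrative sketch only, not the paper's implementation: `generate_reasoning`, `constrained_decode`, and `update_reasoning` are hypothetical placeholders standing in for prompts to the single instruction-tuned LLM, and the stub bodies exist only so the control flow is runnable.

```python
# Hypothetical sketch of the R4R inference loop. The three helper
# functions below are stand-ins for calls to one reasoning-capable LLM;
# their names and signatures are assumptions, not the paper's API.

def generate_reasoning(query: str) -> str:
    """Stub: the LLM converts the query into compact structured reasoning."""
    return f"keywords: {query}"

def constrained_decode(query: str, reasoning: str, k: int = 3) -> list[str]:
    """Stub: constrained decoding over the docid space with a chosen GR method."""
    return [f"docid-{i}" for i in range(k)]

def update_reasoning(reasoning: str, candidates: list[str]) -> str:
    """Stub: the same LLM revises its reasoning given the retrieved candidates."""
    return reasoning + " | seen: " + ",".join(candidates)

def r4r_retrieve(query: str, rounds: int = 2, k: int = 3) -> list[str]:
    # Step 1: generate an initial structured reasoning for the query.
    reasoning = generate_reasoning(query)
    candidates: list[str] = []
    # Step 2: alternate between (i) constrained docid decoding and
    # (ii) refining the reasoning from the retrieval results.
    for _ in range(rounds):
        candidates = constrained_decode(query, reasoning, k)
        reasoning = update_reasoning(reasoning, candidates)
    return candidates
```

Note that no second model appears anywhere in the loop: the same LLM plays both roles, which is the "no additional models or training" property the abstract claims.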