Generative retrieval (GR) models encode a corpus within model parameters and generate relevant document identifiers directly for a given query. While this paradigm shows promise in retrieval tasks, existing GR models struggle with complex queries in numerical contexts, such as those involving semantic reasoning over financial reports, due to limited reasoning capabilities. This limitation leads to suboptimal retrieval accuracy and hinders practical applicability. We propose ReasonGR, a framework designed to enhance multi-step semantic reasoning in numerical contexts within GR. ReasonGR employs a structured prompting strategy combining task-specific instructions with stepwise reasoning guidance to better address complex retrieval queries. Additionally, it integrates a reasoning-focused adaptation module to improve the learning of reasoning-related parameters. Experiments on the FinQA dataset, which contains financial queries over complex documents, demonstrate that ReasonGR improves retrieval accuracy and consistency, indicating its potential for advancing GR models in reasoning-intensive retrieval scenarios.
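As an illustration of the structured prompting strategy mentioned in the abstract, the following is a minimal sketch of how a task-specific instruction and stepwise reasoning guidance might be combined with a query before it is passed to a GR model. The abstract does not specify the prompt format, so the function name `build_prompt`, the field labels, and the example instruction and steps are all hypothetical and are not taken from ReasonGR itself.

```python
from typing import List


def build_prompt(query: str, instruction: str, steps: List[str]) -> str:
    """Assemble a structured prompt (hypothetical layout, not the paper's).

    The prompt places a task-specific instruction first, then numbered
    reasoning steps, then the query, and ends with a cue for the model
    to generate a document identifier.
    """
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Instruction: {instruction}\n"
        f"Reasoning steps:\n{numbered}\n"
        f"Query: {query}\n"
        "Document identifier:"
    )


# Illustrative inputs only; the wording is an assumption.
example_prompt = build_prompt(
    query="What was the percentage change in net revenue between the two years?",
    instruction="Retrieve the financial report passage needed to answer the query.",
    steps=[
        "Identify the financial quantities the query refers to.",
        "Determine which reporting periods are being compared.",
        "Select the document that reports those quantities for those periods.",
    ],
)
```

Keeping the instruction and the reasoning steps as separate arguments makes it straightforward to vary one while holding the other fixed, which is one way such a prompt design could be ablated.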