Generative retrieval (GR) maps queries directly to document identifiers (docids) using parametric knowledge. However, this design makes corpus expansion costly: adding new documents requires updating model parameters to encode new document-docid associations, which incurs repeated training and causes catastrophic forgetting of previously indexed documents. In this work, we revisit incremental GR as an in-context retrieval problem, where newly added documents are supplied as inference-time document-docid evidence. We propose ICICLE, an in-context indexing framework that performs source-aware docid generation over both parametric memory and context-provided document-docid pairs. ICICLE combines a `[COPY]`-based routing mechanism, preference-based calibration, and large-context adaptation to distinguish context-grounded retrieval from parametric retrieval. Experiments on MS MARCO and NQ320K show that ICICLE improves retrieval of newly introduced documents while preserving retention of previously seen documents, without corpus-specific retraining. Our analysis further shows that degradation at high shot counts is mainly caused by routing failure, highlighting source-selection calibration as a key bottleneck for scaling in-context generative retrieval.
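To make the idea of source-aware docid generation concrete, the following is a minimal sketch of how a decoded output might be routed to either the context or the parametric memory. It is an illustration under stated assumptions, not ICICLE's implementation: the abstract does not specify the output format, so the convention that a context-grounded prediction begins with `[COPY]` followed by the docid, the function name `resolve_docid`, the `context_pairs` mapping, and the three source labels are all hypothetical.

```python
from typing import Dict, Tuple

# Assumed routing marker; the abstract names a `[COPY]`-based mechanism
# but does not describe how the token appears in the decoded output.
COPY_TOKEN = "[COPY]"


def resolve_docid(model_output: str, context_pairs: Dict[str, str]) -> Tuple[str, str]:
    """Classify a decoded docid as context-grounded or parametric.

    model_output  -- decoded string from a (hypothetical) GR model.
    context_pairs -- docid -> document text for documents supplied in context.
    Returns (docid, source), where source is one of
    "context", "parametric", or "routing_failure".
    """
    output = model_output.strip()
    if output.startswith(COPY_TOKEN):
        docid = output[len(COPY_TOKEN):].strip()
        if docid in context_pairs:
            # The model chose to copy, and the docid is present in the context.
            return docid, "context"
        # The model chose to copy, but no such docid was supplied in context.
        return docid, "routing_failure"
    # No routing marker: treat the docid as recalled from model parameters.
    return output, "parametric"


if __name__ == "__main__":
    context = {"d-new-1": "A newly added passage.", "d-new-2": "Another new passage."}
    for raw in ["[COPY] d-new-1", "d-old-7", "[COPY] d-missing"]:
        print(raw, "->", resolve_docid(raw, context))
```

In this toy setting, a routing failure is only detectable when the model copies a docid that is absent from the context; the opposite error, answering from parametric memory when the relevant document sits in the context, would require relevance labels to identify.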