Parametric Retrieval-Augmented Generation (PRAG) encodes external documents into lightweight parameter modules that can be retrieved and merged at inference time, offering a promising alternative to in-context retrieval augmentation. Despite this potential, many PRAG implementations train document adapters with task-supervised objectives, which may cause each adapter to encode both document-specific facts and reusable task-solving behavior. This entanglement may make adapter composition less reliable: when multiple adapters are merged at inference time, their overlapping task behaviors can accumulate alongside the document-specific updates, potentially making the merged adapter less stable and less focused on the intended document knowledge. To examine this issue, we explore Orthogonal Subspace Decomposition (OSD), an adapter-training setup that separates reusable task behavior from document-specific knowledge. Concretely, we first train a task LoRA to capture reusable task behavior, and then train document LoRAs to encode document-specific knowledge in an orthogonal subspace. This setup provides a controlled way to examine how orthogonalizing task and document LoRA updates affects adapter composition in multi-document PRAG. Experiments across multiple knowledge-intensive tasks and model scales suggest that this orthogonalization strategy can improve compositional robustness in parametric RAG, especially when multiple document adapters are merged.
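As a rough illustration of the idea, the sketch below builds a toy task LoRA update, projects several toy document LoRA updates onto the orthogonal complement of the task update's column space, and merges them by summation. The abstract does not state how orthogonality is enforced or how adapters are merged, so the post-hoc projection, the summation merge, the matrix shapes, and the helper names (`orthogonal_projector`, `project_doc_update`) are all assumptions made for illustration rather than the method itself.

```python
import numpy as np


def orthogonal_projector(task_B: np.ndarray) -> np.ndarray:
    """Return P = I - Q Q^T, the projector onto the orthogonal
    complement of the column space of task_B (hypothetical helper)."""
    q, _ = np.linalg.qr(task_B)  # reduced QR: q has shape (d_out, r)
    return np.eye(task_B.shape[0]) - q @ q.T


def project_doc_update(doc_delta: np.ndarray, task_B: np.ndarray) -> np.ndarray:
    """Remove the part of a document update that lies in the task subspace."""
    return orthogonal_projector(task_B) @ doc_delta


rng = np.random.default_rng(0)
d_out, d_in, r = 16, 12, 2  # toy sizes, chosen only for this example

# Toy task LoRA: delta_W_task = B_task @ A_task
task_B = rng.normal(size=(d_out, r))
task_A = rng.normal(size=(r, d_in))
task_delta = task_B @ task_A

# Three toy document LoRA updates, each of rank r
doc_deltas = [
    rng.normal(size=(d_out, r)) @ rng.normal(size=(r, d_in)) for _ in range(3)
]

# Assumed merge rule: plain summation of the document updates
naive_merged = sum(doc_deltas)
merged = sum(project_doc_update(d, task_B) for d in doc_deltas)

# Overlap between the task update and each merged document update
naive_overlap = np.linalg.norm(task_delta.T @ naive_merged)
overlap = np.linalg.norm(task_delta.T @ merged)

print(f"overlap without projection: {naive_overlap:.3e}")
print(f"overlap with projection:    {overlap:.3e}")
```

In this toy setting the projected document updates have numerically zero overlap with the task update no matter how many are summed, whereas the unprojected sum generally does not. A training-time constraint or regularizer would be another way to obtain the same property.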