Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving external documents. As an emerging form of RAG, parametric retrieval-augmented generation (PRAG) encodes documents as model parameters (i.e., LoRA modules) and injects these representations into the model during inference, enabling interaction between the LLM and documents at the parametric level. Compared with directly placing documents in the input context, PRAG is more efficient and has the potential to offer deeper model-document interaction. Despite growing attention, the mechanism underlying parametric injection remains poorly understood. In this work, we present a systematic study of PRAG to clarify the role of parametric injection, showing that parameterized documents capture only part of a document's semantic information, and relying on them alone yields inferior performance compared to interaction at the text level. However, these parametric representations encode high-level document information that can enhance the model's understanding of documents within the input context. When parameterized documents are combined with textual documents, the model can leverage relevant information more effectively and becomes more robust to noisy inputs, achieving better performance than either source alone. We recommend jointly using parameterized and textual documents and advocate for increasing the information content of parametric representations to advance PRAG.
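To make the notion of parametric injection concrete, the following is a minimal, self-contained sketch of the core idea: a document is assumed to have been distilled offline into a low-rank (LoRA) update for a linear layer, which is merged into the base weights before answering a query and removed afterwards. The class name, shapes, and hyperparameters are illustrative assumptions, not the paper's implementation; in practice such updates would be trained per document and loaded as adapter modules.

```python
import torch
import torch.nn as nn

class LoRAInjector:
    """Illustrative parametric injection: merge a document-specific low-rank
    update into a linear layer at inference time, then restore the weights."""

    def __init__(self, layer: nn.Linear, rank: int = 8, alpha: float = 16.0):
        self.layer = layer
        self.scaling = alpha / rank
        # Document-specific low-rank factors; in PRAG these would be trained
        # offline per document rather than initialized randomly as here.
        self.A = torch.randn(rank, layer.in_features) * 0.01
        self.B = torch.zeros(layer.out_features, rank)
        self._delta = None

    def inject(self):
        # W <- W + (alpha / r) * B @ A  (the "parameterized document")
        self._delta = self.scaling * (self.B @ self.A)
        with torch.no_grad():
            self.layer.weight += self._delta

    def eject(self):
        # Remove the update so the model returns to its document-free state.
        with torch.no_grad():
            self.layer.weight -= self._delta
        self._delta = None


if __name__ == "__main__":
    base = nn.Linear(64, 64)
    original = base.weight.clone()

    injector = LoRAInjector(base)
    injector.inject()   # parametric document is now part of the model
    # ... run generation here, optionally with the textual document in the prompt ...
    injector.eject()    # restore the original weights

    assert torch.allclose(base.weight, original, atol=1e-6)
```

In the combined setting studied in the paper, the same document would also appear as text in the input context, so the model receives both the injected parametric signal and the token-level evidence.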