Despite the impressive capabilities of large language models (LLMs) across diverse applications, they still suffer from trustworthiness issues such as hallucination and misalignment. Retrieval-augmented language models (RAG) have been proposed to enhance the credibility of generations by grounding them in external knowledge, but the theoretical understanding of their generation risks remains unexplored. In this paper, we answer: 1) whether RAG can indeed lead to low generation risks, 2) how to provide provable guarantees on the generation risks of RAG and vanilla LLMs, and 3) what sufficient conditions enable RAG models to reduce generation risks. We propose C-RAG, the first framework to certify generation risks for RAG models. Specifically, we provide a conformal risk analysis for RAG models and certify an upper confidence bound on generation risks, which we refer to as the conformal generation risk. We also provide theoretical guarantees on conformal generation risks for general bounded risk functions under test distribution shifts. We prove that RAG achieves a lower conformal generation risk than a single LLM when the quality of the retrieval model and transformer is non-trivial. Our extensive empirical results demonstrate the soundness and tightness of our conformal generation risk guarantees across four widely used NLP datasets on four state-of-the-art retrieval models.
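To make the idea of certifying an upper confidence bound on a bounded generation risk concrete, here is a minimal sketch. It is not the bound used by C-RAG: it swaps in a plain one-sided Hoeffding inequality, assumes the calibration samples are i.i.d. with the test distribution (no distribution shift), and assumes per-example risks lie in [0, 1]. The function name `hoeffding_risk_upper_bound` and its parameters are hypothetical, introduced only for illustration.

```python
import math
from typing import Sequence


def hoeffding_risk_upper_bound(risks: Sequence[float], delta: float) -> float:
    """Upper confidence bound on the expected risk of a generator.

    Illustrative only: uses the one-sided Hoeffding inequality, not the
    conformal bound from the paper. Assumes i.i.d. calibration risks in [0, 1].
    With probability at least 1 - delta over the calibration draw, the true
    expected risk is at most the returned value.
    """
    if len(risks) == 0:
        raise ValueError("need at least one calibration risk")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if any(r < 0.0 or r > 1.0 for r in risks):
        raise ValueError("risks must be bounded in [0, 1]")

    n = len(risks)
    empirical_risk = sum(risks) / n
    # Hoeffding: P(true_risk > empirical_risk + t) <= exp(-2 * n * t**2).
    # Setting the right-hand side to delta and solving for t gives the slack.
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # The risk is bounded by 1, so the certificate is clipped there.
    return min(1.0, empirical_risk + slack)
```

For example, calling the function on per-example risks measured for a RAG pipeline and for a retrieval-free LLM on the same calibration set yields two certificates that can be compared directly. The slack term shrinks as the calibration set grows, which is one simple way to see why a tightness claim depends on the number of calibration samples.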