Despite the impressive capabilities of large language models (LLMs) across diverse applications, they still suffer from trustworthiness issues, such as hallucinations and misalignments. Retrieval-augmented language models (RAG) have been proposed to enhance the credibility of generations by grounding them in external knowledge, but a theoretical understanding of their generation risks remains unexplored. In this paper, we answer: 1) whether RAG can indeed lead to low generation risks, 2) how to provide provable guarantees on the generation risks of RAG and vanilla LLMs, and 3) what sufficient conditions enable RAG models to reduce generation risks. We propose C-RAG, the first framework to certify generation risks for RAG models. Specifically, we provide conformal risk analysis for RAG models and certify an upper confidence bound of generation risks, which we refer to as the conformal generation risk. We also provide theoretical guarantees on conformal generation risks for general bounded risk functions under test distribution shifts. We prove that RAG achieves a lower conformal generation risk than a single LLM when the quality of the retrieval model and transformer is non-trivial. Our extensive empirical results demonstrate the soundness and tightness of our conformal generation risk guarantees on four widely used NLP datasets with four state-of-the-art retrieval models.
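To make the notion of an upper confidence bound on generation risk concrete, the following is a minimal illustrative sketch, not C-RAG's actual certification procedure: it computes a Hoeffding-style upper confidence bound on the mean risk from a held-out calibration set, assuming per-sample risks bounded in [0, 1]. The function name and the choice of concentration inequality are illustrative assumptions.

```python
import math

def risk_upper_confidence_bound(risks, delta=0.05):
    """Hoeffding-style upper confidence bound on the true mean risk.

    risks: per-sample risk values in [0, 1] from a calibration set
           (illustrative stand-in for a conformal risk analysis).
    delta: allowed failure probability; the bound holds with
           probability at least 1 - delta over the calibration draw.
    """
    n = len(risks)
    empirical_risk = sum(risks) / n
    # Hoeffding margin: sqrt(ln(1/delta) / (2n)) for [0, 1]-bounded risks.
    margin = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return min(1.0, empirical_risk + margin)

# Example: 100 calibration samples, 10 of which incur full risk.
bound = risk_upper_confidence_bound([0.0] * 90 + [1.0] * 10, delta=0.05)
```

The bound tightens as the calibration set grows (the margin shrinks at rate 1/sqrt(n)), which mirrors why certified risk bounds of this flavor become practical with moderately sized held-out sets.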