Despite the impressive capabilities of large language models (LLMs) across diverse applications, they still suffer from trustworthiness issues, such as hallucination and misalignment. Retrieval-augmented language models (RAG) have been proposed to enhance the credibility of generations by grounding them in external knowledge, but the theoretical understanding of their generation risks remains unexplored. In this paper, we answer: 1) whether RAG can indeed lead to low generation risks, 2) how to provide provable guarantees on the generation risks of RAG and vanilla LLMs, and 3) what sufficient conditions enable RAG models to reduce generation risks. We propose C-RAG, the first framework to certify generation risks for RAG models. Specifically, we provide a conformal risk analysis for RAG models and certify an upper confidence bound on generation risks, which we refer to as the conformal generation risk. We also provide theoretical guarantees on conformal generation risks for general bounded risk functions under test distribution shifts. We prove that RAG achieves a lower conformal generation risk than a single LLM when the quality of the retrieval model and transformer is non-trivial. Our extensive empirical results demonstrate the soundness and tightness of our conformal generation risk guarantees across four widely used NLP datasets with four state-of-the-art retrieval models.
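To make the idea of certifying an upper confidence bound on a bounded generation risk concrete, here is a minimal sketch. It does not reproduce the C-RAG bound: it substitutes a plain Hoeffding bound, and it assumes i.i.d. calibration samples with per-sample risks in [0, 1] and no distribution shift. The function name `hoeffding_risk_upper_bound` and the parameter `delta` are hypothetical, chosen only for illustration.

```python
import math
from typing import Sequence


def hoeffding_risk_upper_bound(risks: Sequence[float], delta: float = 0.1) -> float:
    """Return an upper bound on the expected risk that holds with
    probability at least 1 - delta over the draw of the calibration set.

    Assumptions (not taken from the paper): risks are i.i.d., each lies in
    [0, 1], and the test distribution matches the calibration distribution.
    """
    if len(risks) == 0:
        raise ValueError("need at least one calibration risk")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if any(r < 0.0 or r > 1.0 for r in risks):
        raise ValueError("each risk must lie in [0, 1]")

    n = len(risks)
    empirical_risk = sum(risks) / n
    # Hoeffding's inequality: P(true risk > empirical + slack) <= delta.
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # The risk is bounded by 1, so the certificate is clipped there.
    return min(1.0, empirical_risk + slack)


if __name__ == "__main__":
    # Hypothetical calibration risks (e.g. 1 - ROUGE-L per example).
    llm_risks = [0.6, 0.7, 0.5, 0.8] * 50
    rag_risks = [0.3, 0.4, 0.2, 0.5] * 50
    print("LLM bound:", hoeffding_risk_upper_bound(llm_risks, delta=0.05))
    print("RAG bound:", hoeffding_risk_upper_bound(rag_risks, delta=0.05))
```

With equal calibration-set sizes and the same `delta`, the slack term is identical for both models, so in this simplified setting the comparison between the two certificates reduces to the comparison between empirical risks.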