Retrieval-augmented generation (RAG) typically assumes centralized access to documents, an assumption that breaks down when knowledge is distributed across private data silos. We propose a secure federated RAG system, built with Flower, in which retrieval runs locally within each silo while server-side aggregation and text generation run inside an attested confidential-computing environment. This design keeps remote LLM inference confidential even when the server is honest-but-curious or compromised. We also propose a cascading inference approach that incorporates a non-confidential third-party model (e.g., Amazon Nova) as a source of auxiliary context without weakening confidentiality.
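The data flow described above (retrieval inside each silo, then aggregation and prompt construction on the server) can be illustrated with a minimal, framework-free sketch. It does not use the Flower API and does not model attestation, the confidential environment, or the cascading step. The `Hit` type, the token-overlap scoring, and the function names `local_retrieve`, `aggregate`, and `build_prompt` are all hypothetical choices made for illustration, not the system's actual interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    silo: str     # name of the silo that produced this passage
    text: str     # retrieved passage
    score: float  # relevance score computed inside the silo


def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenization (a stand-in for a real retriever)."""
    return set(text.lower().split())


def local_retrieve(silo: str, docs: list[str], query: str, k: int) -> list[Hit]:
    """Runs inside a silo: score local documents, return at most k matches.

    The score is the fraction of query tokens present in the document
    (an assumed toy metric). Only matching passages leave the silo.
    """
    q = tokenize(query)
    hits = [Hit(silo, d, len(q & tokenize(d)) / max(len(q), 1)) for d in docs]
    hits.sort(key=lambda h: h.score, reverse=True)
    return [h for h in hits[:k] if h.score > 0]


def aggregate(per_silo: list[list[Hit]], k: int) -> list[Hit]:
    """Runs on the server: merge per-silo results and keep the global top k.

    In the proposed system this step would execute inside the attested
    confidential environment; that isolation is not modeled here.
    """
    merged = [h for hits in per_silo for h in hits]
    merged.sort(key=lambda h: h.score, reverse=True)
    return merged[:k]


def build_prompt(query: str, hits: list[Hit]) -> str:
    """Assemble the context that would be passed to the LLM for generation."""
    context = "\n".join(f"[{h.silo}] {h.text}" for h in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    silos = {
        "silo_a": ["aspirin reduces fever", "mri scan protocol"],
        "silo_b": ["aspirin dosage for adults", "billing codes"],
    }
    query = "aspirin dosage"
    per_silo = [local_retrieve(name, docs, query, k=1) for name, docs in silos.items()]
    print(build_prompt(query, aggregate(per_silo, k=2)))
```

The point of the sketch is the trust boundary: raw document collections never leave a silo, and only the top-scoring passages reach the server-side aggregation step.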