While the flexible capabilities of large language models (LLMs) allow them to answer a range of queries from knowledge learned during pre-training, retrieval-augmented generation is an important tool for letting LLMs answer questions about information that was not in the pre-training data. Such private information is increasingly generated by organizations and individuals across a wide array of distributed contexts. Until now, performing this kind of retrieval with neural embeddings of queries and documents has leaked information about the queries and the database content unless both were stored locally. We present Private Retrieval Augmented Generation (PRAG), an approach that uses multi-party computation (MPC) to securely transmit queries to a distributed set of servers holding a privately constructed database, which then return exact top-k or approximate top-k documents. This is a first-of-its-kind approach to dense information retrieval that ensures no server can observe a client's query or see the database content. The approach introduces a novel MPC-friendly protocol for inverted file (IVF) approximate search that allows fast document search over distributed and private data with sublinear communication complexity. This work opens new avenues through which data for use in LLMs can be accessed and used without centralizing it or forgoing privacy.
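To make the idea of scoring documents without any server seeing the query or the database more concrete, the following is a minimal sketch of inner-product scoring over additive secret shares. It is not the PRAG protocol itself. The modulus `Q`, the fixed-point `SCALE`, the trusted-dealer Beaver triples, and every function name are assumptions made for illustration. For brevity the sketch opens the scores before ranking; a private protocol would keep them shared and rank them with secure comparisons. The IVF step is not shown.

```python
import secrets

Q = 2**61 - 1   # modulus for additive sharing (assumed for this sketch)
SCALE = 2**12   # fixed-point scale for embedding coordinates (assumed)

def encode(x):
    """Map a real number to a fixed-point element of Z_Q."""
    return round(x * SCALE) % Q

def decode(v, scale):
    """Map an element of Z_Q back to a signed real number."""
    v %= Q
    if v > Q // 2:
        v -= Q
    return v / scale

def share(v, n=2):
    """Split v into n additive shares that sum to v modulo Q."""
    parts = [secrets.randbelow(Q) for _ in range(n - 1)]
    parts.append((v - sum(parts)) % Q)
    return parts

def reconstruct(parts):
    """Recombine additive shares."""
    return sum(parts) % Q

def beaver_triple(n=2):
    """Hypothetical trusted dealer: shares of random a, b and of c = a*b."""
    a, b = secrets.randbelow(Q), secrets.randbelow(Q)
    return share(a, n), share(b, n), share(a * b % Q, n)

def shared_mul(xs, ys, n=2):
    """Multiply two shared values using one Beaver triple."""
    a_sh, b_sh, c_sh = beaver_triple(n)
    # The parties open d = x - a and e = y - b. Because a and b are
    # uniformly random, d and e reveal nothing about x or y.
    d = reconstruct([(x - a) % Q for x, a in zip(xs, a_sh)])
    e = reconstruct([(y - b) % Q for y, b in zip(ys, b_sh)])
    zs = [(c + d * b + e * a) % Q for a, b, c in zip(a_sh, b_sh, c_sh)]
    zs[0] = (zs[0] + d * e) % Q  # the public term d*e is added by one party
    return zs

def shared_dot(q_sh, doc_sh, n=2):
    """Inner product of two vectors given as per-coordinate share lists."""
    acc = [0] * n
    for xs, ys in zip(q_sh, doc_sh):
        zs = shared_mul(xs, ys, n)
        acc = [(a + z) % Q for a, z in zip(acc, zs)]
    return acc

def private_top_k(query, docs, k, n=2):
    """Rank documents by inner product computed on shares.

    Simplification: scores are opened before sorting, which a real
    private protocol would avoid.
    """
    q_sh = [share(encode(x), n) for x in query]
    scores = []
    for doc in docs:
        d_sh = [share(encode(x), n) for x in doc]
        opened = reconstruct(shared_dot(q_sh, d_sh, n))
        scores.append(decode(opened, SCALE * SCALE))
    return sorted(range(len(docs)), key=lambda i: -scores[i])[:k]
```

In this sketch each party only ever holds uniformly random shares of the query and document coordinates, and the values opened during multiplication are masked by the triple. Scoring every document this way is linear in the database size, which is the cost an approximate index such as IVF is meant to reduce.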