EnronQA: Towards Personalized RAG over Private Documents

Retrieval Augmented Generation (RAG) has become one of the most popular methods for bringing knowledge-intensive context to large language models (LLMs) because it can supply local context at inference time without the cost or data leakage risks associated with fine-tuning. A clear separation of private information from LLM training has made RAG the basis for many enterprise LLM workloads, as it allows a company to augment the LLM's understanding using customers' private documents. Despite its popularity for private documents in enterprise deployments, current RAG benchmarks for validating and optimizing RAG pipelines draw their corpora from public data such as Wikipedia or generic web pages and offer little to no personal context. Seeking to empower more personal and private RAG, we release the EnronQA benchmark, a dataset of 103,638 emails with 528,304 question-answer pairs across 150 different user inboxes. EnronQA enables better benchmarking of RAG pipelines over private data and allows for experimentation with personalized retrieval settings over realistic data. Finally, we use EnronQA to explore the tradeoff between memorization and retrieval when reasoning over private documents.

