Retrieval-Augmented Generation Must Move Beyond Factual Grounding to Represent Diverse Opinions

This position paper argues that Retrieval-Augmented Generation (RAG) systems exhibit a systematic factual bias: they optimize for reducing epistemic uncertainty while ignoring the aleatoric uncertainty inherent in opinion-rich content. This misalignment demands a paradigm shift in retrieval system design. A survey of 35 major RAG benchmarks reveals that only one addresses opinion synthesis, confirming that the bias is structural, embedded in datasets, retrieval objectives, and evaluation metrics alike. Beyond its technical limitations, this bias poses risks to transparent and accountable AI: echo chamber effects that amplify dominant viewpoints, systematic under-representation of minority voices, and potential opinion manipulation through biased information synthesis. We formalize the problem through the lens of uncertainty quantification, showing that factual queries should minimize posterior entropy whereas opinion queries must preserve it, and we derive a unified objective over coverage, fidelity, and fairness using the Wasserstein distance. As an existence proof, we present Opinion-Aware RAG (O-RAG), an architecture featuring LLM-based opinion extraction and entity-linked opinion metadata, and evaluate it across two domains (e-commerce seller forums and public hotel reviews) spanning 10K+ discussions and 6K+ customer reviews. Experiments demonstrate an 18-48% reduction in Wasserstein distance to corpus-level sentiment distributions, +26.8% sentiment diversity, and +42.7% entity match rate, with human evaluators preferring opinion-enriched responses 79.2% of the time. We propose a research agenda and argue that, as RAG systems increasingly mediate access to information, their ability to represent diverse perspectives is not optional but essential.
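To make the two quantities named in the abstract concrete, the following is a minimal sketch of how the entropy of a sentiment distribution and the Wasserstein distance between a retrieved set and the corpus might be computed. It is an illustration under stated assumptions, not the paper's implementation: the three-level ordinal sentiment scale, the unit spacing between levels, and the function names `distribution`, `entropy`, and `wasserstein_1d` are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Assumption: sentiment is a three-level ordinal scale with unit spacing.
SENTIMENTS = ["negative", "neutral", "positive"]


def distribution(labels):
    """Turn a list of sentiment labels into a probability vector over SENTIMENTS."""
    counts = Counter(labels)
    total = sum(counts[s] for s in SENTIMENTS)
    if total == 0:
        raise ValueError("no recognised sentiment labels")
    return [counts[s] / total for s in SENTIMENTS]


def entropy(p):
    """Shannon entropy in bits; zero-probability bins contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two distributions on the same ordered bins.

    For one-dimensional distributions with unit spacing, this equals the sum of
    absolute differences between the two cumulative distribution functions.
    """
    cdf_gap = 0.0
    distance = 0.0
    # The final bin is skipped because both CDFs reach 1 there.
    for pi, qi in zip(p[:-1], q[:-1]):
        cdf_gap += pi - qi
        distance += abs(cdf_gap)
    return distance


# Toy corpus with a split opinion (made-up labels, for illustration only).
corpus = distribution(["negative"] * 4 + ["neutral"] * 2 + ["positive"] * 4)

# A retrieved set that collapses onto one viewpoint versus one that keeps the split.
one_sided = distribution(["positive"] * 5)
balanced = distribution(["negative"] * 2 + ["positive"] * 2)

for name, retrieved in [("one_sided", one_sided), ("balanced", balanced)]:
    print(name, "entropy:", entropy(retrieved),
          "distance to corpus:", wasserstein_1d(retrieved, corpus))
```

In this toy setting the one-sided retrieved set has zero entropy and lies farther from the corpus distribution than the balanced set. That is the behaviour the abstract describes as appropriate for factual queries but harmful for opinion queries.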
