Many language models now augment their responses with retrieval, which has led to the widespread adoption of retrieval-augmented generation (RAG) systems. Although retrieval is a core component of RAG, much of the research in this area overlooks the extensive body of work on fair ranking and so neglects the interests of all stakeholders involved. This paper presents the first systematic evaluation of RAG systems integrated with fair rankings. We focus on measuring the fair exposure of each relevant item across the rankings used by RAG systems (i.e., item-side fairness), with the aim of promoting equitable growth for relevant item providers. To understand the relationship between item fairness, ranking quality, and generation quality in RAG, we analyze nine RAG systems that incorporate fair rankings across seven datasets. Our findings indicate that RAG systems with fair rankings can maintain a high level of generation quality and, in many cases, even outperform traditional RAG systems, despite the general trend of a tradeoff between fairness and system effectiveness. We believe these insights lay the groundwork for responsible and equitable RAG systems and open new avenues for future research. We publicly release our codebase and dataset at https://github.com/kimdanny/Fair-RAG.