Despite the significant progress made by existing retrieval-augmented language models (RALMs) in providing trustworthy responses grounded in reliable sources, they often overlook effective alignment with human preferences. In the alignment process, reward models (RMs) act as a crucial proxy for human values to guide optimization. However, it remains unclear how to evaluate and select a reliable RM for preference alignment in RALMs. To this end, we propose RAG-RewardBench, the first benchmark for evaluating RMs in RAG settings. First, we design four crucial and challenging RAG-specific scenarios to assess RMs, including multi-hop reasoning, fine-grained citation, appropriate abstain, and conflict robustness. Then, we incorporate 18 RAG subsets, 6 retrievers, and 24 RALMs to increase the diversity of data sources. Finally, we adopt an LLM-as-a-judge approach to improve the efficiency and effectiveness of preference annotation, which exhibits a strong correlation with human annotations. Based on RAG-RewardBench, we conduct a comprehensive evaluation of 45 RMs and uncover their limitations in RAG scenarios. Additionally, we reveal that existing trained RALMs show almost no improvement in preference alignment, highlighting the need for a shift towards preference-aligned training. We release our benchmark and code publicly at https://huggingface.co/datasets/jinzhuoran/RAG-RewardBench/ for future work.
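The LLM-as-a-judge annotation step can be illustrated with a minimal sketch. The prompt template, the `build_judge_prompt` and `parse_verdict` helpers, and the example below are all hypothetical illustrations of pairwise preference labeling, not the paper's exact protocol.

```python
# Minimal sketch of LLM-as-a-judge pairwise preference annotation for RAG
# responses. Assumes a judge model that answers with a single letter "A" or
# "B"; the template and parsing logic are illustrative, not the paper's.

JUDGE_PROMPT = """You are an impartial judge. Given a question and the
retrieved documents, decide which response is better grounded and more
helpful.

[Question]
{question}

[Retrieved Documents]
{documents}

[Response A]
{response_a}

[Response B]
{response_b}

Answer with a single letter: "A" or "B"."""


def build_judge_prompt(question, documents, response_a, response_b):
    """Fill the pairwise-comparison template for one annotation example."""
    return JUDGE_PROMPT.format(
        question=question,
        documents="\n".join(documents),
        response_a=response_a,
        response_b=response_b,
    )


def parse_verdict(judge_output):
    """Map the judge's raw text output to a preference label, or None."""
    verdict = judge_output.strip().upper()
    if verdict.startswith("A"):
        return "A"
    if verdict.startswith("B"):
        return "B"
    return None  # unparseable: skip or re-query in a real pipeline


# Example with a stubbed judge reply (a real pipeline would query an LLM):
prompt = build_judge_prompt(
    "Who wrote The Selfish Gene?",
    ["Doc 1: The Selfish Gene (1976) is a book by Richard Dawkins."],
    "Richard Dawkins wrote it [Doc 1].",
    "It was written by Stephen Jay Gould.",
)
print(parse_verdict("A"))  # the judge's letter becomes the preference label
```

In practice, such pipelines typically also swap the A/B positions and re-query to control for position bias, keeping only examples where the judge's verdict is consistent.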