In this paper, we present a multi-adapter retrieval-augmented generation system (MARAGS) for Meta's Comprehensive RAG (CRAG) competition at KDD CUP 2024. CRAG is a question answering dataset containing 3 subtasks aimed at realistic question answering RAG tasks, with a diverse set of question topics, question types, time-dynamic answers, and questions featuring entities of varying popularity. Our system follows a standard setup for web-based RAG: processed web pages provide context for an LLM to generate answers, and API endpoints are queried for additional information. MARAGS also uses multiple adapters to meet the differing requirements of these tasks, along with a standard cross-encoder model that ranks candidate passages by their relevance to the question. Our system placed 2nd on Task 1 and 3rd on Task 2.
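The retrieve-then-rank step described above can be illustrated with a minimal sketch. It is not the MARAGS implementation: the function names `overlap_score`, `rank_passages`, and `build_prompt` are hypothetical, and the token-overlap scorer is only a toy stand-in for a cross-encoder, which in practice would score each (question, passage) pair with a trained model.

```python
from typing import Callable, List, Tuple


def overlap_score(question: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder (assumption, not the paper's model):
    the fraction of question tokens that also appear in the passage."""
    q_tokens = set(question.lower().split())
    p_tokens = set(passage.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & p_tokens) / len(q_tokens)


def rank_passages(
    question: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every candidate passage against the question and keep the top_k."""
    scored = [(p, score_fn(question, p)) for p in passages]
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def build_prompt(question: str, ranked: List[Tuple[str, float]]) -> str:
    """Assemble the ranked passages into a context block for the LLM."""
    context = "\n".join(f"- {passage}" for passage, _ in ranked)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    candidates = [
        "The Eiffel Tower is in Paris.",
        "Bananas are yellow.",
        "Paris is the capital of France.",
    ]
    q = "what is the capital of france"
    print(build_prompt(q, rank_passages(q, candidates)))
```

Passing the scorer in as `score_fn` keeps the ranking logic independent of the model, so a real cross-encoder could be swapped in without changing `rank_passages`.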