As Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into RAG systems for downstream tasks such as open-domain QA. Many works have attempted to improve query rewriting by training small models with reinforcement learning rather than relying on costly LLMs. However, current methods require annotations (e.g., labeled relevant documents or downstream answers) or predesigned rewards for feedback, which generalize poorly and fail to exploit signals tailored to query rewriting. In this paper, we propose a framework for training query rewriting models without annotations. By leveraging a publicly available reranker, the framework provides feedback that aligns well with the rewriting objectives. Experimental results demonstrate that the framework outperforms the baselines.
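To make the idea of reranker-based feedback concrete, the following is a minimal sketch of how an annotation-free reward for a rewritten query might be computed. It is not the paper's implementation: the reward definition (mean reranker score of the documents retrieved with the rewrite, judged against the original query) is an assumption, and `toy_reranker_score`, `toy_retrieve`, and `rewrite_reward` are hypothetical names. A simple token-overlap scorer stands in for a real publicly available reranker so that the example runs without external dependencies.

```python
from typing import Callable, List, Sequence


def _tokens(text: str) -> set:
    """Lowercased whitespace tokens of a string."""
    return set(text.lower().split())


def toy_reranker_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a reranker.

    Returns the fraction of query tokens that also occur in the document.
    A real system would call a trained reranker model here instead.
    """
    q = _tokens(query)
    if not q:
        return 0.0
    return len(q & _tokens(document)) / len(q)


def toy_retrieve(query: str, corpus: Sequence[str], k: int) -> List[str]:
    """Hypothetical retriever: top-k documents by token overlap with the query."""
    ranked = sorted(corpus, key=lambda d: len(_tokens(query) & _tokens(d)), reverse=True)
    return list(ranked[:k])


def rewrite_reward(
    original_query: str,
    rewritten_query: str,
    corpus: Sequence[str],
    scorer: Callable[[str, str], float] = toy_reranker_score,
    k: int = 1,
) -> float:
    """Assumed reward: retrieve with the rewrite, then score the retrieved
    documents against the ORIGINAL query and average the scores.

    No relevance labels or gold answers are used, only the scorer's output.
    """
    docs = toy_retrieve(rewritten_query, corpus, k)
    if not docs:
        return 0.0
    return sum(scorer(original_query, d) for d in docs) / len(docs)


if __name__ == "__main__":
    corpus = [
        "the eiffel tower is located in paris france",
        "bananas are rich in potassium",
        "paris is the capital of france",
    ]
    original = "where is the eiffel tower"
    for rewrite in ("eiffel tower location paris", "bananas potassium"):
        print(rewrite, "->", rewrite_reward(original, rewrite, corpus))
```

In a reinforcement learning setup, a scalar of this kind could serve as the feedback signal for the rewriting model. Whether the reward is computed against the original query, and how the scores are aggregated, are design choices that this sketch only assumes.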