LLMs confront inherent limitations in their knowledge, memory, and actions. Retrieval augmentation stands as a vital mechanism to address these limitations by bringing in useful information from external sources to augment the LLM. However, existing retrieval methods face two pressing issues. On one hand, general retrievers are not properly optimized for retrieval augmentation and hence exhibit limited effectiveness; on the other hand, task-specific retrievers excel in their targeted retrieval augmentation scenario but lack the versatility to handle diverse scenarios. In this work, we propose \textbf{LLM-Embedder} for the unified support of diverse retrieval augmentation scenarios. Our method makes three technical contributions. Firstly, we introduce a new \textit{reward formulation}, namely the {rank-aware reward}. It exploits the ranking position of the desired output among $N$ sampled outputs from the LLM, which leads to a fine-grained and robust computation of reward from the LLM's feedback. Secondly, we design a novel \textit{distillation objective}, called graded distillation. It incorporates both the absolute value and the relative order of the reward for a more sufficient utilization of the LLM's feedback. Thirdly, we systematically optimize the \textit{multi-task learning}, which effectively unifies the multiple retrieval functionalities into one model. In our experiments, LLM-Embedder notably improves the LLM's performance on various downstream tasks, and outperforms both general and task-specific retrievers by a substantial margin.
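To make the rank-aware reward concrete, the following is a minimal, hypothetical sketch: the desired output's rank among the $N$ outputs sampled from the LLM is mapped to a scalar reward. The specific mapping below (a simple linear decay in rank) is an illustrative assumption, not the paper's exact formula.

```python
def rank_aware_reward(rank: int, n_samples: int) -> float:
    """Hypothetical rank-aware reward: map the desired output's rank among
    n_samples LLM samples (rank 0 = best) to a score in (0, 1].

    NOTE: the linear mapping here is an illustrative assumption; the paper
    only specifies that the reward is derived from the ranking position.
    """
    if not 0 <= rank < n_samples:
        raise ValueError("rank must lie in [0, n_samples)")
    # Rank 0 (desired output ranked first) -> 1.0; last rank -> 1/N.
    return (n_samples - rank) / n_samples

# A candidate whose desired output ranks higher receives a larger reward,
# giving a graded signal rather than a binary hit/miss.
print(rank_aware_reward(0, 8))  # 1.0
print(rank_aware_reward(7, 8))  # 0.125
```

Compared with a binary reward (desired output generated or not), a rank-based score distinguishes near-misses from complete failures, which is what enables the fine-grained feedback described above.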