Retrieval-augmented generation (RAG) has emerged as a promising solution for mitigating hallucinations of large language models (LLMs) with retrieved external knowledge. Adaptive RAG enhances this approach by enabling dynamic retrieval during generation, activating retrieval only when a query exceeds the LLM's internal knowledge. Existing methods primarily focus on detecting the LLM's confidence via statistical uncertainty. Instead, we present the first attempt to solve adaptive RAG from a representation perspective and develop an inherent control-based framework, termed \name. Specifically, we extract features that represent the honesty and confidence directions of the LLM and adopt them to control LLM behavior and guide retrieval timing decisions. We also design a simple yet effective query formulation strategy to support adaptive retrieval. Experiments show that \name is superior to existing adaptive RAG methods on a diverse set of tasks: honesty steering effectively makes LLMs more honest, and confidence monitoring is a promising indicator for triggering retrieval. Our code is available at \url{https://github.com/HSLiu-Initial/CtrlA}.