Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to mitigate large language model (LLM) hallucinations by incorporating external knowledge retrieval. However, existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies: over-retrieving when retrieval is unnecessary, or failing to retrieve iteratively when complex reasoning requires it. Recent adaptive retrieval strategies choose among these retrieval modes, but they base their predictions only on query complexity and lack user-driven flexibility, which makes them ill-suited to diverse user application needs. In this paper, we introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off. Our approach leverages two classifiers: one trained to prioritize accuracy and another to prioritize retrieval efficiency. Via an interpretable control parameter $\alpha$, users can seamlessly navigate between minimal-cost retrieval and high-accuracy retrieval based on their specific requirements. We empirically demonstrate that our approach effectively balances accuracy, retrieval cost, and user controllability, making it a practical and adaptable solution for real-world applications.
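To make the role of $\alpha$ concrete, the following is a minimal sketch of one way two classifiers could be combined under a single control parameter. It is an illustration under stated assumptions, not the method described in this paper: the abstract does not specify how the two classifiers are combined, so the linear interpolation of their output probabilities, the three strategy labels, the convention that $\alpha = 0$ favors cost and $\alpha = 1$ favors accuracy, and the names `blend` and `choose_strategy` are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical strategy labels, ordered from cheapest to most expensive.
STRATEGIES = ("no_retrieval", "single_step", "multi_step")


def blend(p_cost: Sequence[float], p_acc: Sequence[float], alpha: float) -> List[float]:
    """Linearly interpolate two probability vectors.

    Assumption: alpha = 0 uses only the cost-oriented classifier,
    alpha = 1 uses only the accuracy-oriented classifier.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(p_cost) != len(p_acc):
        raise ValueError("probability vectors must have the same length")
    return [(1.0 - alpha) * c + alpha * a for c, a in zip(p_cost, p_acc)]


def choose_strategy(p_cost: Sequence[float], p_acc: Sequence[float], alpha: float) -> str:
    """Return the strategy label with the highest blended probability."""
    if len(p_cost) != len(STRATEGIES):
        raise ValueError("expected one probability per strategy")
    mixed = blend(p_cost, p_acc, alpha)
    best = max(range(len(mixed)), key=mixed.__getitem__)
    return STRATEGIES[best]


if __name__ == "__main__":
    # Made-up classifier outputs for a single query (illustrative only).
    p_cost = [0.7, 0.2, 0.1]  # cost-oriented classifier prefers skipping retrieval
    p_acc = [0.1, 0.3, 0.6]   # accuracy-oriented classifier prefers multi-step retrieval
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, choose_strategy(p_cost, p_acc, alpha))
```

Under these assumptions, sweeping `alpha` from 0 to 1 moves the decision for the same query from the cheapest strategy toward the most retrieval-intensive one, which is the kind of user-facing control the abstract describes.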