Retrieval-Augmented Generation (RAG) equips large language models (LLMs) with the ability to retrieve external knowledge, mitigating hallucinations by incorporating information beyond the model's intrinsic knowledge. However, most prior work invokes retrieval deterministically for every query, which is ill-suited to tasks such as long-form question answering. Instead, performing retrieval dynamically, invoking it only when the underlying LLM lacks the required knowledge, can be more efficient. In this context, we delve deeper into the question "To Retrieve or Not to Retrieve?" by exploring multiple uncertainty detection methods. We evaluate these methods on the task of long-form question answering with dynamic retrieval and present our comparisons. Our findings suggest that uncertainty detection metrics such as Degree Matrix Jaccard and Eccentricity can reduce the number of retrieval calls by almost half, with only a slight reduction in question-answering accuracy.
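The gating idea described above can be sketched in a few lines: sample several answers from the LLM, measure how much they agree, and trigger retrieval only when disagreement (uncertainty) is high. The sketch below is a minimal, hypothetical illustration of a degree-matrix-style score built from pairwise Jaccard similarity over answer token sets; the function names, the token-set similarity, the normalization, and the threshold value are all assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of uncertainty-gated retrieval.
# Assumption: agreement among sampled answers is measured by pairwise
# Jaccard similarity of their word sets; the paper's actual metric may differ.

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two answers."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def degree_matrix_uncertainty(samples: list[str]) -> float:
    """Uncertainty in [0, 1] from the degree of each answer in the
    pairwise-similarity graph: high mutual similarity -> low uncertainty."""
    m = len(samples)
    # Degree of answer i: sum of its similarities to all answers (incl. itself).
    degrees = [
        sum(jaccard(samples[i], samples[j]) for j in range(m)) for i in range(m)
    ]
    # If all answers agree, every degree is m and the score is 0.
    return 1.0 - sum(degrees) / (m * m)

def should_retrieve(samples: list[str], threshold: float = 0.3) -> bool:
    """Gate the retrieval call: retrieve only when uncertainty is high.
    The 0.3 threshold is an arbitrary placeholder, not a tuned value."""
    return degree_matrix_uncertainty(samples) > threshold
```

In a full pipeline, `samples` would come from sampling the LLM several times at nonzero temperature; when `should_retrieve` returns False, the model answers from its own knowledge and the retrieval call is skipped, which is how the call count can drop.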