Retrieval-augmented generation (RAG) frameworks have emerged as a promising solution to multi-hop question answering (QA) tasks, as they enable large language models (LLMs) to incorporate external knowledge and mitigate their inherent knowledge deficiencies. Despite this progress, existing RAG frameworks, which usually follow the retrieve-then-read paradigm, often struggle with multi-hop QA involving temporal information, since they have difficulty retrieving and synthesizing accurate time-related information. To address this challenge, this paper proposes a novel framework called review-then-refine, which aims to enhance LLM performance in multi-hop QA scenarios with temporal information. Our approach begins with a review phase, in which decomposed sub-queries are dynamically rewritten with temporal information, enabling the subsequent adaptive retrieval and reasoning process. In addition, we implement an adaptive retrieval mechanism that minimizes unnecessary retrievals, thus reducing the potential for hallucinations. In the subsequent refine phase, the LLM synthesizes the information retrieved for each sub-query with its internal knowledge to formulate a coherent answer. Extensive experimental results across multiple datasets demonstrate the effectiveness of the proposed framework, highlighting its potential to significantly improve the multi-hop QA capabilities of LLMs.