With the rapid development of large language models, Retrieval-Augmented Generation (RAG) has been widely adopted. However, existing RAG paradigms are inevitably affected by erroneous retrieved information, which reduces the reliability and correctness of the generated results. To improve the relevance of the retrieved information, this study proposes a method that replaces traditional retrievers with GPT-3.5, leveraging the knowledge acquired from its vast training corpus to generate retrieval information. We also propose a web-retrieval-based method for fine-grained knowledge retrieval, which uses the strong reasoning capability of GPT-3.5 to semantically partition the question. To mitigate hallucination in GPT retrieval and reduce noise in web retrieval, we propose a multi-source retrieval framework, named MSRAG, which combines GPT retrieval with web retrieval. Experiments on multiple knowledge-intensive QA datasets demonstrate that the proposed framework outperforms existing RAG frameworks in the overall efficiency and accuracy of QA systems.