With the advent of large language models (LLMs) and multimodal large language models (MLLMs), the potential of retrieval-augmented generation (RAG) has attracted considerable research attention. Various novel algorithms and models have been introduced to enhance different aspects of RAG systems. However, the absence of a standardized implementation framework, coupled with the inherent complexity of the RAG process, makes it difficult and time-consuming for researchers to compare and evaluate these approaches in a consistent environment. Existing RAG toolkits, such as LangChain and LlamaIndex, are often heavyweight and inflexible, and fail to meet researchers' customization needs. In response to this challenge, we develop \ours{}, an efficient and modular open-source toolkit designed to help researchers reproduce and compare existing RAG methods and develop their own algorithms within a unified framework. Our toolkit implements 16 advanced RAG methods and collects and organizes 38 benchmark datasets. Its features include a customizable modular framework, multimodal RAG capabilities, a rich collection of pre-implemented RAG works, comprehensive datasets, efficient auxiliary pre-processing scripts, and extensive, standard evaluation metrics. Our toolkit and resources are available at https://github.com/RUC-NLPIR/FlashRAG.