This paper presents EasyRAG, a simple, lightweight, and efficient retrieval-augmented generation (RAG) framework for automated network operations. The framework offers three advantages. The first is accurate question answering: we design a straightforward RAG scheme consisting of (1) a specific data processing workflow, (2) dual-route sparse retrieval for coarse ranking, (3) an LLM reranker for reranking, and (4) LLM answer generation and optimization. This approach achieved first place in the GLM4 track in the preliminary round and second place in the GLM4 track in the semifinals. The second is simple deployment: our method consists primarily of BM25 retrieval and BGE-reranker reranking, requires no fine-tuning of any model, occupies minimal VRAM, and is easy to deploy and highly scalable. We also provide a flexible code library with various search and generation strategies, which facilitates the implementation of custom pipelines. The third is efficient inference: we design an inference acceleration scheme covering the entire coarse ranking, reranking, and generation process that significantly reduces the inference latency of RAG while maintaining a good level of accuracy. Each acceleration scheme is plug-and-play and can be applied to any component of the RAG process, consistently improving the efficiency of the RAG system. Our code and data are released at \url{https://github.com/BUAADreamer/EasyRAG}.
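The coarse-ranking and reranking stages described above can be illustrated with a minimal, self-contained sketch. It is not the released EasyRAG implementation. The abstract does not say which two routes are used, so the sketch assumes one BM25 index over chunk text and one over a hypothetical `path` field. The `BM25` class is a plain-Python stand-in for a real BM25 library, and `overlap_reranker` is a toy placeholder for a neural reranker such as BGE-reranker. All function and field names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-word characters (assumed tokenizer).
    return re.findall(r"\w+", text.lower())


class BM25:
    """Plain-Python Okapi BM25 index, used here as a stand-in for a BM25 library."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF that stays positive for every term.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def scores(self, query):
        q_tokens = tokenize(query)
        out = []
        for toks, tf in zip(self.docs, self.tfs):
            s = 0.0
            for t in q_tokens:
                if t in tf:
                    norm = tf[t] + self.k1 * (1 - self.b + self.b * len(toks) / self.avgdl)
                    s += self.idf[t] * tf[t] * (self.k1 + 1) / norm
            out.append(s)
        return out


def coarse_retrieve(query, chunks, k=2):
    # Route 1: BM25 over chunk text. Route 2: BM25 over the assumed "path" field.
    routes = (BM25([c["text"] for c in chunks]), BM25([c["path"] for c in chunks]))
    candidates = []
    for index in routes:
        scores = index.scores(query)
        top = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
        for i in top:
            # Merge the two routes, dropping zero-score hits and duplicates.
            if scores[i] > 0 and i not in candidates:
                candidates.append(i)
    return candidates


def overlap_reranker(query, text):
    # Toy placeholder for a neural reranker: fraction of query tokens found in the text.
    q, t = set(tokenize(query)), set(tokenize(text))
    return len(q & t) / (len(q) or 1)


def retrieve(query, chunks, k=2, top_n=1, rerank_fn=overlap_reranker):
    # Coarse ranking with two sparse routes, then reranking of the merged candidates.
    cands = coarse_retrieve(query, chunks, k)
    ranked = sorted(cands, key=lambda i: rerank_fn(query, chunks[i]["text"]), reverse=True)
    return [chunks[i] for i in ranked[:top_n]]


# Hypothetical sample corpus for illustration only.
CHUNKS = [
    {"path": "router/bgp/config", "text": "Configure the BGP neighbor with the peer command."},
    {"path": "switch/vlan/setup", "text": "Create a VLAN and assign ports to it."},
    {"path": "router/ospf/areas", "text": "OSPF areas reduce routing table size."},
]
```

Calling `retrieve("how to configure bgp neighbor", CHUNKS)` returns the top reranked chunk, which would then be passed to the LLM for answer generation. Swapping `rerank_fn` for a call into a real reranker model leaves the rest of the pipeline unchanged.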