Although retrieval-augmented generation (RAG) has been swiftly adopted in industrial applications built on large language models (LLMs), there is no consensus on best practices for building a RAG system: which components it should contain, how those components should be organized, and how each should be implemented for industrial applications, especially in the medical domain. In this work, we first analyze each component of the RAG system and propose practical alternatives for each. We then conduct systematic evaluations on three types of tasks, revealing best practices for improving the RAG system and showing how LLM-based RAG systems trade off performance against efficiency.