Although retrieval-augmented generation (RAG) has been swiftly adopted in industrial applications built on large language models (LLMs), there is no consensus on best practices for building a RAG system: which components it should include, how to organize them, and how to implement each one for industrial applications, especially in the medical domain. In this work, we first analyze each component of the RAG system and propose practical alternatives for each. We then conduct systematic evaluations on three types of tasks, revealing the best practices for improving the RAG system and how LLM-based RAG systems trade off performance against efficiency.