We propose natural language prompt-based retrieval-augmented generation (Prompt-RAG), a novel approach to enhancing the performance of generative large language models (LLMs) in niche domains. Conventional RAG methods mostly rely on vector embeddings, yet it remains uncertain whether generic LLM-based embedding representations are suitable for specialized domains. To explore and exemplify this point, we compared vector embeddings of Korean Medicine (KM) and Conventional Medicine (CM) documents and found that, in contrast to CM embeddings, KM document embeddings correlated more with token overlap and less with human-assessed document relatedness. Unlike conventional RAG models, Prompt-RAG operates without embedding vectors. We assessed its performance through a question-answering (QA) chatbot application, evaluating responses for relevance, readability, and informativeness. Prompt-RAG outperformed existing models, including ChatGPT and conventional vector embedding-based RAGs, in relevance and informativeness. Despite challenges such as content structuring and response latency, advances in LLMs are expected to encourage the use of Prompt-RAG, making it a promising tool for other domains in need of RAG methods.
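The embedding comparison described in the abstract can be illustrated with a minimal sketch. It assumes each document pair comes with two precomputed embedding vectors, the two raw texts, and a human relatedness score. The function names, the dictionary keys, the whitespace-based Jaccard measure of token overlap, and the choice of Pearson correlation are all illustrative assumptions; the abstract does not specify the tokenizer, the overlap measure, or the correlation statistic actually used.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def token_overlap(text_a, text_b):
    """Jaccard overlap of lowercase whitespace tokens (an assumed measure)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_profile(pairs):
    """Correlate embedding similarity with token overlap and human scores.

    `pairs` is a list of dicts with the hypothetical keys
    'emb_a', 'emb_b', 'text_a', 'text_b', and 'human_score'.
    """
    emb_sims = [cosine_similarity(p["emb_a"], p["emb_b"]) for p in pairs]
    overlaps = [token_overlap(p["text_a"], p["text_b"]) for p in pairs]
    human = [p["human_score"] for p in pairs]
    return {
        "embedding_vs_token_overlap": pearson(emb_sims, overlaps),
        "embedding_vs_human": pearson(emb_sims, human),
    }
```

Running `correlation_profile` separately on KM and CM document pairs would give two profiles to compare. Under the pattern reported in the abstract, the KM profile would show a higher `embedding_vs_token_overlap` value and a lower `embedding_vs_human` value than the CM profile.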