Consistently generating high-quality answers by embedding contextual information in the prompt passed to a Large Language Model (LLM) depends on the quality of information retrieval. As the corpus of contextual information grows, the answer quality of Retrieval Augmented Generation (RAG) based Question Answering (QA) systems declines. This work addresses the problem by combining classical text classification with the LLM to enable fast retrieval from the vector store while ensuring the relevance of the retrieved information. To this end, it proposes a new approach, Context Augmented Retrieval (CAR), which partitions the vector database by classifying information in real time as it flows into the corpus. CAR delivers high-quality answer generation along with a significant reduction in information retrieval and answer generation time.
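The partitioning idea described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the keyword-overlap classifier, the category labels, and the in-memory partition dictionary are stand-ins for a trained text classifier and a real vector database, shown only to make the ingest-by-class / retrieve-by-class flow concrete.

```python
# Sketch of the CAR idea: route each incoming document into a
# category-specific partition of the store, then search only the
# partition matching the query's predicted category.
from collections import defaultdict

# Hypothetical label set and keywords, for illustration only.
CATEGORY_KEYWORDS = {
    "finance": {"invoice", "budget", "revenue"},
    "hr":      {"leave", "payroll", "hiring"},
}

def classify(text: str) -> str:
    """Pick the category whose keywords overlap the text most."""
    words = set(text.lower().split())
    scores = {c: len(words & kw) for c, kw in CATEGORY_KEYWORDS.items()}
    return max(scores, key=scores.get)

partitions = defaultdict(list)  # category -> documents in that partition

def ingest(doc: str) -> None:
    """Classify a document in real time as it flows into the corpus."""
    partitions[classify(doc)].append(doc)

def retrieve(query: str) -> list[str]:
    """Search only the partition matching the query's category."""
    return partitions[classify(query)]

ingest("Quarterly revenue and budget summary")
ingest("Payroll schedule and leave policy update")
print(retrieve("What is the hiring and payroll process?"))
# prints only the HR document; the finance partition is never scanned
```

Because retrieval touches a single partition rather than the whole corpus, both search time and the amount of irrelevant context passed to the LLM shrink as the corpus grows, which is the effect the abstract reports.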