We propose an end-to-end system design that uses Retrieval Augmented Generation (RAG) to improve the factual accuracy of Large Language Models (LLMs) on domain-specific and time-sensitive queries over private knowledge bases. Our system integrates a RAG pipeline with upstream dataset processing and downstream performance evaluation. To address LLM hallucinations, we fine-tune models on a curated dataset derived from CMU's extensive resources and annotated by a teacher model. Our experiments demonstrate the system's effectiveness in generating more accurate answers to domain-specific and time-sensitive inquiries. The results also reveal the limitations of fine-tuning LLMs on small-scale, skewed datasets. This research highlights the potential of RAG systems to augment LLMs with external datasets for improved performance on knowledge-intensive tasks. Our code and models are available on GitHub.