This study aims to improve the accuracy and answer quality of large language models (LLMs) in question answering by integrating Elasticsearch into the Retrieval-Augmented Generation (RAG) framework. The experiments use the Stanford Question Answering Dataset (SQuAD) 2.0 as the test set and compare the performance of different retrieval methods: traditional approaches based on keyword matching or semantic-similarity computation (BM25-RAG and TF-IDF-RAG) and the newly proposed ES-RAG scheme. The results show that ES-RAG not only has a clear advantage in retrieval efficiency but also performs well on key metrics such as accuracy, exceeding TF-IDF-RAG by 0.51 percentage points. In addition, Elasticsearch's powerful search capabilities and rich configuration options enable the question-answering system to better handle complex queries and to provide more flexible and efficient responses to diverse user needs. Future work could further explore how to optimize the interaction mechanism between Elasticsearch and the LLM, for example by introducing higher-level semantic understanding and context awareness, to achieve a more intelligent and natural question-answering experience.
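For context on the retrieval methods compared above: Elasticsearch scores full-text `match` queries with BM25 by default, the same ranking function behind the BM25-RAG baseline. The following is a minimal, self-contained sketch of BM25 scoring over a tokenized corpus; the function name `bm25_scores` and the parameter defaults `k1=1.5`, `b=0.75` are illustrative assumptions, not identifiers from the paper.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query_terms` with BM25.

    docs: list of documents, each a list of tokens.
    Returns one score per document (higher = more relevant).
    Assumed illustrative parameters: k1 (term-frequency saturation)
    and b (length normalization).
    """
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs  # average document length

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        for term in set(d):
            df[term] += 1

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF as used in Lucene/Elasticsearch's BM25.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term-frequency component with length normalization.
            norm_tf = (tf[term] * (k1 + 1)) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
            score += idf * norm_tf
        scores.append(score)
    return scores
```

A passage retriever for RAG would rank the corpus by these scores and hand the top-k passages to the LLM as context; ES-RAG delegates this ranking (plus indexing, filtering, and configuration) to an Elasticsearch cluster instead of computing it in-process.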