A Systematic Framework for Enterprise Knowledge Retrieval: Leveraging LLM-Generated Metadata to Enhance RAG Systems

In enterprise settings, efficiently retrieving relevant information from large, complex knowledge bases is essential for operational productivity and informed decision-making. This research presents a systematic empirical framework for metadata enrichment that uses large language models (LLMs) to enhance document retrieval in Retrieval-Augmented Generation (RAG) systems. Our approach employs a structured pipeline that dynamically generates metadata for document segments, improving their semantic representations and retrieval accuracy. Through a controlled 3 × 3 experimental matrix, we compare three chunking strategies (semantic, recursive, and naive) and evaluate their interactions with three embedding techniques (content-only, TF-IDF weighted, and prefix-fusion), isolating the contribution of each component through ablation analysis. The results demonstrate that metadata-enriched approaches consistently outperform content-only baselines: recursive chunking paired with TF-IDF weighted embeddings yields 82.5% precision, and naive chunking with prefix-fusion achieves the strongest ranking quality (NDCG 0.813). Our evaluation uses cross-encoder reranking to generate silver-standard ground truth, and statistical significance is confirmed via Bonferroni-corrected paired t-tests. These findings show that metadata enrichment improves vector space organization and retrieval effectiveness while maintaining sub-30 ms P95 latency, providing a quantitative decision framework for deploying high-performance, scalable RAG systems in enterprise settings.
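For readers unfamiliar with the ranking metric cited above, the following is a minimal sketch of how NDCG at a cutoff k can be computed from graded relevance labels. It is an illustration only: the abstract does not state the exact NDCG variant or cutoff used, so the linear-gain formula, the function names `dcg` and `ndcg`, and the choice to derive the ideal ordering from the retrieved list alone are all assumptions.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain, assumed variant)."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """NDCG@k: DCG of the actual ordering divided by DCG of the ideal ordering.

    Assumption: the ideal ordering is built from the retrieved items only,
    not from every relevant item in the corpus.
    """
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades for one query, in retrieved order.
retrieved = [3, 2, 3, 0, 1]
score = ndcg(retrieved, k=5)
```

A perfectly ordered list scores 1.0, and a list with no relevant results scores 0.0, so values such as the reported 0.813 can be read as a fraction of the best achievable ranking under the chosen labels.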
