Retrieval-Augmented Generation (RAG) systems depend critically on the quality of document chunking to retrieve relevant context. Fixed chunking segments documents into uniform units irrespective of semantics or user intent, producing a precision-recall trade-off that tuning chunk size alone cannot resolve. Semantic and agentic methods partially address these limitations but do not incorporate the user query at the chunking stage. We present Query-Adaptive Semantic Chunking (QASC), which constructs chunks dynamically by integrating the query into segmentation through three mechanisms: cosine similarity scoring between sentence and query embeddings to identify seed sentences, contextual window expansion around the seeds to preserve coherence, and chunk-level score aggregation to ensure that each chunk is relevant as a whole. We evaluate QASC on 100 technical documents and 200 queries spanning four query types, comparing it against fixed chunking at five granularities, recursive splitting, semantic chunking, and agentic chunking. QASC achieves an F1-score of 0.85, a relative improvement of 18-27% over fixed chunking and 8-12% over the semantic and agentic alternatives. Ablation studies confirm that each of the three components contributes to this result. Human evaluation by three annotators (Cohen's kappa = 0.82) corroborates that QASC produces more relevant and more coherent chunks than existing methods.
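The three mechanisms named in the abstract (seed selection by cosine similarity, window expansion around seeds, and chunk-level score aggregation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `embed`, `cosine`, and `query_adaptive_chunks`, the toy bag-of-words embedding standing in for a real sentence encoder, the `seed_threshold` and `window` parameters and their defaults, the merging of overlapping windows, and the use of the mean sentence score as the chunk score.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a real sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query_adaptive_chunks(sentences, query, seed_threshold=0.3, window=1):
    """Return (score, text) chunks built around query-relevant seed sentences.

    seed_threshold and window are hypothetical parameters, not values from the paper.
    """
    q = embed(query)
    scores = [cosine(embed(s), q) for s in sentences]

    # 1. Seed identification: sentences whose similarity to the query is high enough.
    seeds = [i for i, score in enumerate(scores) if score >= seed_threshold]

    # 2. Contextual window expansion: grow each seed by `window` sentences on
    #    both sides, merging windows that overlap or touch.
    spans = []
    for i in seeds:
        lo = max(0, i - window)
        hi = min(len(sentences) - 1, i + window)
        if spans and lo <= spans[-1][1] + 1:
            spans[-1][1] = max(spans[-1][1], hi)
        else:
            spans.append([lo, hi])

    # 3. Chunk-level aggregation: here, the mean sentence score over the span
    #    (an assumed aggregation rule).
    chunks = []
    for lo, hi in spans:
        chunk_score = sum(scores[lo:hi + 1]) / (hi - lo + 1)
        chunks.append((chunk_score, " ".join(sentences[lo:hi + 1])))

    # Highest-scoring chunks first.
    return sorted(chunks, key=lambda c: c[0], reverse=True)
```

Calling `query_adaptive_chunks(sentences, "chunk size retrieval")` on a list of sentences returns the chunks ordered by aggregated score, and a query that matches no sentence yields an empty list. Replacing `embed` with a dense sentence encoder would change the scores but not the structure of the three steps.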