Retrieval-augmented generation (RAG) systems that rely on semantic search often fail to retrieve the complete set of evidence for complex queries, particularly when the information is distributed across multiple sources. Existing approaches either rely on iterative agentic retrieval, which can be inefficient, or maintain additional structures such as knowledge graphs, which introduce storage and maintenance overhead. In this paper, we propose GraphER, a graph-based enrichment and reranking framework that (1) leverages the organizational structure of the data to capture proximity relationships beyond semantic similarity, (2) constructs a graph at query time from these proximities, and (3) applies graph-based ranking to surface the top candidate documents. Experiments on table retrieval, multi-hop retrieval, and long-document retrieval benchmarks show consistent improvements in retrieval completeness. GraphER requires no dedicated graph infrastructure and integrates with standard vector stores. The framework is retriever-agnostic, supports multiple forms of proximity, and adds minimal query-time latency.
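The three steps named in the abstract (enrichment via structural proximity, query-time graph construction, graph-based ranking) can be illustrated with a minimal sketch. The abstract does not specify how proximity is defined or which ranking algorithm is used, so everything here is an assumption: proximity is taken to mean "shares a parent document or table", ranking is taken to be personalized PageRank seeded with the retriever's scores, and the names `build_proximity_graph`, `personalized_pagerank`, `enrich_and_rerank`, and `parent_of` are hypothetical.

```python
from collections import defaultdict


def build_proximity_graph(candidates, parent_of):
    """Assumed proximity: two chunks are linked if they share a parent."""
    groups = defaultdict(list)
    for chunk in candidates:
        groups[parent_of[chunk]].append(chunk)
    graph = {chunk: set() for chunk in candidates}
    for members in groups.values():
        for a in members:
            for b in members:
                if a != b:
                    graph[a].add(b)
    return graph


def personalized_pagerank(graph, seed_scores, damping=0.85, iters=50):
    """Power iteration; assumes at least one positive seed score."""
    total = sum(seed_scores.values())
    prior = {n: seed_scores.get(n, 0.0) / total for n in graph}
    rank = dict(prior)
    for _ in range(iters):
        nxt = {n: (1 - damping) * prior[n] for n in graph}
        for node, nbrs in graph.items():
            if nbrs:
                share = damping * rank[node] / len(nbrs)
                for m in nbrs:
                    nxt[m] += share
            else:
                # Isolated node: return its mass to the seed distribution.
                for m in graph:
                    nxt[m] += damping * rank[node] * prior[m]
        rank = nxt
    return rank


def enrich_and_rerank(retrieved, parent_of, top_k=3):
    """retrieved: chunk id -> retriever score; parent_of: chunk id -> parent id."""
    # Enrichment: pull in structural neighbours of the retrieved chunks.
    parents = {parent_of[c] for c in retrieved}
    candidates = [c for c in parent_of if parent_of[c] in parents]
    # The graph is built at query time, over the candidate set only.
    graph = build_proximity_graph(candidates, parent_of)
    rank = personalized_pagerank(graph, retrieved)
    return sorted(candidates, key=lambda c: rank[c], reverse=True)[:top_k]


# Toy data (hypothetical): the retriever found a1 and b1 but missed a2.
parent_of = {"a1": "docA", "a2": "docA", "b1": "docB", "c1": "docC"}
retrieved = {"a1": 0.9, "b1": 0.4}
print(enrich_and_rerank(retrieved, parent_of, top_k=2))
```

In this toy run, `a2` is never returned by the retriever but enters the candidate set because it shares a parent with `a1`, and the graph ranking places it above the weaker semantic hit `b1`. Because the graph covers only the candidates for one query, the sketch needs no persistent graph store, only a chunk-to-parent mapping of the kind commonly kept as metadata in a vector store.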