Semantic search in retrieval-augmented generation (RAG) systems is often insufficient for complex information needs, particularly when relevant evidence is scattered across multiple sources. Prior approaches to this problem include agentic retrieval strategies, which expand the semantic search space by generating additional queries. However, these methods do not fully leverage the organizational structure of the data and instead rely on iterative exploration, which can lead to inefficient retrieval. Another class of approaches employs knowledge graphs to model non-semantic relationships through graph edges. Although effective in capturing richer proximities, such methods incur significant maintenance costs and are often incompatible with the vector stores used in most production systems. To address these limitations, we propose GraphER, a graph-based enrichment and reranking method that captures multiple forms of proximity beyond semantic similarity. GraphER independently enriches data objects during offline indexing and performs graph-based reranking over candidate objects at query time. This design does not require a knowledge graph, allowing GraphER to integrate seamlessly with standard vector stores. In addition, GraphER is retriever-agnostic and introduces negligible latency overhead. Experiments on multiple retrieval benchmarks demonstrate the effectiveness of the proposed approach.
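The two-stage design described above (independent enrichment at indexing time, graph-based reranking over candidates at query time) can be illustrated with a minimal sketch. This is not the GraphER algorithm itself, which the abstract does not specify. The metadata fields (`doc_id`, `position`, `entities`), the regex-based entity extractor, the edge weights, and the single-step score propagation with mixing factor `alpha` are all assumptions chosen for illustration. The sketch only shows how per-object metadata stored alongside vectors could yield a candidate graph at query time without a persistent knowledge graph.

```python
from dataclasses import dataclass, field
from itertools import combinations
import re


@dataclass
class EnrichedObject:
    # All enrichment fields are hypothetical; the abstract does not list them.
    obj_id: str
    text: str
    doc_id: str
    position: int
    entities: frozenset = field(default_factory=frozenset)


def enrich(obj_id, text, doc_id, position):
    """Offline step: derive metadata from this object alone (no global graph)."""
    # Assumed stand-in for an entity extractor: capitalized tokens of 3+ letters.
    entities = frozenset(re.findall(r"\b[A-Z][a-zA-Z]{2,}\b", text))
    return EnrichedObject(obj_id, text, doc_id, position, entities)


def edge_weight(a, b):
    """Non-semantic proximity between two candidates, read from stored metadata."""
    weight = 0.0
    if a.doc_id == b.doc_id and abs(a.position - b.position) == 1:
        weight += 1.0  # structural proximity: adjacent chunks of one document
    if a.entities & b.entities:
        weight += 0.5  # shared-entity proximity
    return weight


def graph_rerank(candidates, scores, alpha=0.3):
    """Query-time step: one round of score propagation over the candidate graph."""
    neighbors = {c.obj_id: [] for c in candidates}
    for a, b in combinations(candidates, 2):
        w = edge_weight(a, b)
        if w > 0:
            neighbors[a.obj_id].append((b.obj_id, w))
            neighbors[b.obj_id].append((a.obj_id, w))

    reranked = {}
    for c in candidates:
        links = neighbors[c.obj_id]
        total = sum(w for _, w in links)
        # Weighted mean of neighbor scores; isolated nodes keep their own score.
        if total:
            support = sum(scores[n] * w for n, w in links) / total
        else:
            support = scores[c.obj_id]
        reranked[c.obj_id] = (1 - alpha) * scores[c.obj_id] + alpha * support
    return sorted(reranked.items(), key=lambda kv: kv[1], reverse=True)


# Usage: `scores` stands in for similarity scores from any first-stage retriever.
candidates = [
    enrich("a", "GraphER reranks candidates.", "d1", 0),
    enrich("b", "it adds little latency.", "d1", 1),
    enrich("c", "unrelated cooking notes.", "d2", 0),
]
scores = {"a": 0.9, "b": 0.4, "c": 0.5}
ranking = graph_rerank(candidates, scores)
```

In this toy run, object `b` has a lower retriever score than `c` but is adjacent to the top-scoring object `a` in the same document, so it moves ahead of `c` after propagation. Because the graph is built only over the retrieved candidates, its cost depends on the candidate count rather than the corpus size, which is consistent with the low-overhead claim above, although the actual method may differ substantially.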