In multi-vector retrieval, both queries and data are represented as sets of high-dimensional vectors, enabling finer-grained semantic matching and improving retrieval quality over single-vector approaches. However, practical adoption of multi-vector retrieval is held back by the lack of effective indexing algorithms. Existing approaches that reuse standard single-vector indexes often fail to preserve multi-vector semantics or remain slow. In this work, we present GEM, a native indexing framework for multi-vector representations. The core idea is to construct a proximity graph directly over vector sets, preserving their fine-grained semantics while enabling efficient navigation. First, GEM designs a set-level clustering scheme that associates each vector set with only its most informative clusters, reducing redundancy without hurting semantic coverage. Then, it builds local proximity graphs within clusters and bridges them into a globally navigable structure. To handle the non-metric nature of multi-vector similarity, GEM decouples the graph construction metric from the final relevance score and injects semantic shortcuts that guide navigation toward relevant regions. At query time, GEM launches beam search from multiple entry points and prunes paths early using cluster cues. To further improve efficiency, GEM uses quantized distance estimation for both indexing and search. Across in-domain, out-of-domain, and multi-modal benchmarks, GEM achieves up to 16× speedup over state-of-the-art methods while matching or improving accuracy.
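To illustrate why multi-vector similarity is called non-metric, the following minimal sketch assumes the MaxSim (late-interaction) score used by ColBERT-style retrievers; the abstract does not name the scoring function, so this choice, the helper names `dot` and `maxsim`, and the toy vector sets are assumptions made only for illustration. Under this assumed score, swapping the roles of the two sets changes the result, so it is not symmetric and cannot be a metric.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query: Sequence[Vector], doc: Sequence[Vector]) -> float:
    """Assumed MaxSim score: for each query vector, take its best
    inner product with any document vector, then sum over the query."""
    return sum(max(dot(q, d) for d in doc) for q in query)


# Hypothetical toy vector sets in 2-D (not taken from the paper).
A = [(1.0, 0.0), (0.0, 1.0)]
B = [(0.5, 0.5)]

score_ab = maxsim(A, B)  # A plays the query role
score_ba = maxsim(B, A)  # B plays the query role

# The two directions disagree, so the score is asymmetric.
print(score_ab, score_ba)
```

Because such a score violates the symmetry that standard proximity-graph indexes implicitly rely on, it is plausible to build the graph with a different, better-behaved set-level measure and reserve the relevance score for final ranking, which is the decoupling the abstract describes.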