SINDI: an Efficient Index for Approximate Maximum Inner Product Search on Sparse Vectors

From arXiv; 18 pages; accepted by ICDE 2026. Because of the ICDE 2026 submission limit (a maximum of 6 submissions per author), Lei Chen and Xuemin Lin are not included as authors.

Sparse vector Maximum Inner Product Search (MIPS) is crucial in multi-path retrieval for Retrieval-Augmented Generation (RAG). Recent inverted index-based and graph-based algorithms have achieved high search accuracy with practical efficiency. However, their performance in production environments is often limited by redundant distance computations and frequent random memory accesses. Furthermore, the compressed storage format of sparse vectors hinders the use of SIMD acceleration. In this paper, we propose the sparse inverted non-redundant distance index (SINDI), which incorporates three key optimizations: (i) Efficient Inner Product Computation: SINDI leverages SIMD acceleration and eliminates redundant identifier lookups, enabling batched inner product computation; (ii) Memory-Friendly Design: SINDI replaces random memory accesses to original vectors with sequential accesses to inverted lists, substantially reducing memory-bound latency; (iii) Vector Pruning: SINDI retains only the high-magnitude non-zero entries of vectors, improving query throughput while maintaining accuracy. We evaluate SINDI on multiple real-world datasets. Experimental results show that SINDI achieves state-of-the-art performance across datasets of varying scales, languages, and models. On the MsMarco dataset, when Recall@50 exceeds 99%, SINDI delivers single-thread queries-per-second (QPS) improvements ranging from 4.2$\times$ to 26.4$\times$ compared with SEISMIC and PyANNs. Notably, SINDI has been integrated into Ant Group's open-source vector search library, VSAG.
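The three optimizations can be illustrated with a small sketch. This is not the authors' implementation or the VSAG API. It is a toy value-storing inverted index written under several assumptions: the function names `prune`, `build_index`, and `search` are hypothetical, the `keep` parameter is an assumed form of the pruning rule (keep the largest-magnitude entries), and NumPy's vectorized multiply-add stands in for hand-written SIMD. Each inverted list stores both document identifiers and the corresponding values, so a query scans lists sequentially and never looks up the original vectors.

```python
import numpy as np

def prune(vec, keep):
    """Keep the `keep` largest-magnitude entries of a sparse vector (dict: dim -> value).
    Assumed pruning rule, for illustration only."""
    top = sorted(vec.items(), key=lambda kv: abs(kv[1]), reverse=True)[:keep]
    return dict(top)

def build_index(vectors, keep=None):
    """Build a value-storing inverted index: dim -> (doc_ids, values)."""
    lists = {}
    for doc_id, vec in enumerate(vectors):
        if keep is not None:
            vec = prune(vec, keep)
        for dim, val in vec.items():
            ids, vals = lists.setdefault(dim, ([], []))
            ids.append(doc_id)
            vals.append(val)
    # Contiguous arrays make each list a sequential scan at query time.
    return {d: (np.asarray(i, dtype=np.int64), np.asarray(v, dtype=np.float32))
            for d, (i, v) in lists.items()}

def search(index, n_docs, query, k):
    """Return the top-k (doc_id, inner_product) pairs for a sparse query."""
    scores = np.zeros(n_docs, dtype=np.float32)
    for dim, q_val in query.items():
        if dim in index:
            ids, vals = index[dim]
            # Batched multiply-add over one inverted list. Doc ids are unique
            # within a list, so the fancy-indexed update is safe.
            scores[ids] += q_val * vals
    top = np.argsort(-scores, kind="stable")[:k]
    return [(int(i), float(scores[i])) for i in top]

# Toy data: three sparse documents and one sparse query.
docs = [{0: 1.0, 2: 2.0}, {1: 3.0, 2: 0.5}, {0: 0.1, 1: 0.2, 3: 4.0}]
query = {0: 1.0, 2: 1.0}
full = search(build_index(docs), len(docs), query, k=2)
pruned = search(build_index(docs, keep=1), len(docs), query, k=1)
```

With pruning enabled, each document contributes fewer postings, so the query touches less memory, and the scores become approximations of the exact inner products.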
