Approximate Nearest Neighbor Search (ANNS) is a fundamental operation in vector databases, enabling efficient similarity search in high-dimensional spaces. While dense ANNS has been optimized with specialized hardware accelerators, sparse ANNS is still confined to CPU-based implementations, which limits its scalability. This limitation is increasingly critical as hybrid retrieval systems, which combine sparse and dense embeddings, become standard in Information Retrieval (IR) pipelines. We propose SpANNS, a near-memory processing architecture for sparse ANNS. SpANNS combines a hybrid inverted index with efficient query management and runtime optimizations. The architecture is built on a CXL Type-2 near-memory platform, where a specialized controller handles query parsing and cluster filtering, while compute-enabled DIMMs perform index traversal and distance computations close to the data. SpANNS achieves 15.2x to 21.6x faster execution than state-of-the-art CPU baselines, offering a scalable and efficient solution for sparse vector search.
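To make the division of work described above more concrete, the following is a minimal software sketch of clustered inverted-index search over sparse vectors. It is not the SpANNS implementation: the index layout (postings grouped by dimension and then by cluster), the inner-product scoring, and all names (`build_index`, `search`, `n_probe`) are assumptions chosen for illustration. The first step of `search` stands in for the cluster filtering that the abstract assigns to the controller, and the second step stands in for the index traversal and distance computation assigned to the compute-enabled DIMMs.

```python
from collections import defaultdict

# A sparse vector is modeled as a dict: dimension id -> nonzero weight.
# All names and the index layout below are illustrative assumptions,
# not the data structures used by SpANNS.


def dot(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(d, 0.0) for d, w in a.items())


def build_index(vectors, assignments):
    """Build postings[dim][cluster] -> list of (vector id, weight)."""
    postings = defaultdict(lambda: defaultdict(list))
    for vec_id, vec in enumerate(vectors):
        cluster = assignments[vec_id]
        for d, w in vec.items():
            postings[d][cluster].append((vec_id, w))
    return postings


def search(query, postings, centroids, n_probe, k):
    """Return the top-k (vector id, score) pairs among the probed clusters."""
    # Step 1: cluster filtering. Rank clusters by centroid similarity
    # and keep only the n_probe most promising ones.
    ranked = sorted(
        range(len(centroids)),
        key=lambda c: dot(query, centroids[c]),
        reverse=True,
    )
    probe = ranked[:n_probe]

    # Step 2: index traversal and scoring. Visit only the posting lists
    # of the query's nonzero dimensions, restricted to probed clusters,
    # and accumulate partial inner products per candidate vector.
    scores = defaultdict(float)
    for d, qw in query.items():
        by_cluster = postings.get(d)
        if by_cluster is None:
            continue
        for c in probe:
            for vec_id, w in by_cluster.get(c, ()):
                scores[vec_id] += qw * w

    # Highest score first; ties broken by vector id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

In this sketch, `n_probe` trades recall for work: probing fewer clusters shortens the traversal but may miss true neighbors that were assigned to unprobed clusters, which is what makes the search approximate.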