Approximate Nearest Neighbor (ANN) search has become fundamental to modern AI infrastructure, powering recommendation systems, search engines, and large language models at industry leaders from Google to OpenAI. Hierarchical Navigable Small World (HNSW) graphs have emerged as the dominant ANN algorithm, widely adopted in production systems for their superior recall-versus-latency trade-off. However, as vector databases scale to billions of embeddings, HNSW faces critical bottlenecks: memory consumption grows rapidly, distance-computation overhead dominates query latency, and performance degrades on heterogeneous data distributions. This paper presents Adaptive Quantization and Rerank HNSW (AQR-HNSW), a novel framework that synergistically integrates three strategies to enhance HNSW scalability. AQR-HNSW introduces (1) density-aware adaptive quantization, achieving 4x compression while preserving distance relationships; (2) multi-stage re-ranking that reduces unnecessary distance computations by 35%; and (3) quantization-optimized SIMD implementations delivering 16-64 operations per cycle across architectures. Evaluation on standard benchmarks demonstrates 2.5-3.3x higher queries per second (QPS) than state-of-the-art HNSW implementations while maintaining over 98% recall, with 75% lower memory use for the index graph and 5x faster index construction.
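To make the quantize-then-rerank idea concrete, the following is a minimal sketch (not the paper's actual algorithm): 8-bit scalar quantization compresses float32 vectors 4x, a cheap first pass ranks candidates by approximate distance over the compressed codes, and a second pass re-ranks a small shortlist with exact float distances. The function names and the uniform per-dimension scales are illustrative assumptions; the paper's density-aware scheme would adapt quantization to the local data distribution.

```python
import numpy as np

def quantize(vectors):
    # 8-bit scalar quantization: float32 -> uint8 (4x compression).
    # Uniform per-dimension scales; a simplification of density-aware quantization.
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def search_with_rerank(query, vectors, codes, lo, scale, k=5, shortlist=20):
    # Stage 1: rank all points by approximate distance on dequantized codes.
    approx = codes.astype(np.float32) * scale + lo
    d_approx = np.linalg.norm(approx - query, axis=1)
    cand = np.argsort(d_approx)[:shortlist]
    # Stage 2: re-rank only the shortlist with exact float distances.
    d_exact = np.linalg.norm(vectors[cand] - query, axis=1)
    return cand[np.argsort(d_exact)[:k]]

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 64)).astype(np.float32)
query = data[42] + 0.01 * rng.normal(size=64).astype(np.float32)

codes, lo, scale = quantize(data)
top = search_with_rerank(query, data, codes, lo, scale)
```

In a real system the first pass would run directly on the uint8 codes with SIMD integer arithmetic rather than dequantizing; the shortlist size controls the accuracy-versus-cost trade-off of the re-ranking stage.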