Approximate nearest neighbor (ANN) graph indices such as HNSW and Vamana construct their edge topology in full-precision or high-fidelity quantized metric spaces, relegating binary quantization (BQ) to a post-hoc distance estimator during search. We challenge this paradigm by asking: can binary quantization build the graph, instead of merely accelerating graph search? We present QuIVer (Quantized Index for Vector Retrieval), a training-free ANN graph index that performs edge selection, pruning, and graph navigation entirely within a 2-bit Sign-Magnitude BQ metric space. QuIVer combines three mutually reinforcing mechanisms: (i) a 2-bit Sign-Magnitude encoding that preserves both the sign and the magnitude strength of each coordinate at 1/12 the memory of float32 vectors; (ii) Vamana alpha-diversity pruning executed directly on BQ distances, producing long-range navigational edges that are robust to quantization noise; and (iii) symmetric BQ beam search using only XOR/AND/Popcount, with a final float32 reranking step confined to a small candidate set. On MiniLM-1M (384-d), Cohere-1M (768-d), and DBpedia-OpenAI-1M (1536-d), QuIVer achieves >=91% Recall@10 at 16-39K QPS, with construction times of 70-140 seconds and under 0.9 GB of hot memory -- outperforming hnswlib by ~16x and USearch HNSW by ~5x in throughput at comparable recall. Controlled experiments on six additional datasets -- including multimodal CLIP embeddings (RedCaps-512), word vectors (GloVe-100), CV features (SIFT-128, GIST-960), uniform random vectors, and a low-rank synthetic dataset -- delineate QuIVer's applicability boundary: high recall requires cosine-native distributions with low effective dimensionality, whereas Vamana's graph reachability holds on every dataset tested. Notably, multimodal CLIP embeddings achieve 78% recall at ef=64, revealing a continuous gradient between single-modality SOTA and non-contrastive usability.
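To make mechanisms (i) and (iii) concrete, the following is a minimal sketch of one possible 2-bit sign-magnitude code and a similarity computed with only XOR, AND, and popcount. It is not QuIVer's actual encoding or distance. The per-vector threshold (mean absolute value), the two magnitude levels (1 and 2), and the names `encode`, `bq_similarity`, and `decode_levels` are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Code = Tuple[int, int, int]  # (sign bit-plane, magnitude bit-plane, dimension)


def encode(vec: List[float]) -> Code:
    """Pack a vector into two bit-planes, one bit per coordinate each.

    Assumption: a coordinate is "strong" when |x| exceeds the vector's mean
    absolute value. The abstract does not state the real thresholding rule.
    """
    if not vec:
        raise ValueError("empty vector")
    dim = len(vec)
    tau = sum(abs(x) for x in vec) / dim
    sign = mag = 0
    for i, x in enumerate(vec):
        if x < 0:
            sign |= 1 << i  # sign bit: 1 means negative
        if abs(x) > tau:
            mag |= 1 << i   # magnitude bit: 1 means strong
    return sign, mag, dim


def popcount(x: int) -> int:
    return bin(x).count("1")


def _weighted(mask: int, ma: int, mb: int) -> int:
    # Sum of (1 + ma_i) * (1 + mb_i) over the coordinates selected by mask,
    # expanded so that only AND and popcount are needed.
    return (popcount(mask) + popcount(mask & ma)
            + popcount(mask & mb) + popcount(mask & ma & mb))


def bq_similarity(a: Code, b: Code) -> int:
    """Dot product of the two codes read as values in {-2, -1, +1, +2}."""
    sa, ma, dim = a
    sb, mb, dim_b = b
    if dim != dim_b:
        raise ValueError("dimension mismatch")
    full = (1 << dim) - 1
    differ = sa ^ sb       # coordinates whose signs disagree
    agree = differ ^ full  # complement restricted to dim bits
    return _weighted(agree, ma, mb) - _weighted(differ, ma, mb)


def decode_levels(code: Code) -> List[int]:
    """Reference decoder, used only to check bq_similarity."""
    sign, mag, dim = code
    return [(-1 if (sign >> i) & 1 else 1) * (2 if (mag >> i) & 1 else 1)
            for i in range(dim)]


a = encode([0.9, -0.1, 0.2, -0.8])   # levels [ 2, -1,  1, -2]
b = encode([0.7, 0.1, -0.3, -0.9])   # levels [ 2,  1, -1, -2]
print(bq_similarity(a, b))           # 6
```

In a real index the bit-planes would presumably be stored as fixed-width machine words so that hardware popcount instructions apply; Python integers are used here only to keep the sketch short.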