Vector search (VS) has become a fundamental component in multimodal data management, enabling core functionalities such as image, video, and code retrieval. As vector data scales rapidly, VS faces growing challenges in balancing search accuracy, latency, scalability, and cost. The evolution of VS has been closely driven by changes in storage architecture. Early VS methods rely on all-in-memory designs for low latency, but their scalability is constrained by memory capacity and cost. To address this, recent research has adopted heterogeneous storage architectures that offload space-intensive vectors and index structures to SSDs, exploiting block locality and I/O-efficient strategies to sustain high search performance at billion scale. Looking ahead, the growing demand for trillion-scale vector retrieval and cloud-native elasticity is driving a further shift toward memory-SSD-object-storage architectures, which enable cost-efficient data tiering and seamless scalability. In this tutorial, we review the evolution of VS techniques from a storage-architecture perspective. We first survey memory-resident methods, covering classical IVF, hash, quantization, and graph-based designs. We then present a systematic overview of heterogeneous-storage VS techniques, including their index designs, block-level layouts, query strategies, and update mechanisms. Finally, we examine emerging cloud-native systems and highlight open research opportunities for future large-scale vector retrieval systems.