On-disk graph-based indexes are favored for billion-scale Approximate Nearest Neighbor Search (ANNS) due to their high performance and cost-efficiency. However, existing systems typically rely on a coupled storage architecture that co-locates vectors and graph topology. This introduces substantial redundant I/O during index updates and degrades usability in dynamic workloads. In this paper, we propose a decoupled storage architecture that physically separates the heavy vectors from the lightweight graph topology. This design substantially improves update performance by reducing redundant I/O, but it introduces I/O amplification during ANNS, which degrades query efficiency. To improve query performance within this update-friendly architecture, we propose two techniques co-designed with the decoupled storage. First, we develop a similarity-aware dynamic layout that optimizes data placement online so that redundantly fetched data can be reused in subsequent search steps, effectively turning read amplification into useful prefetching. Second, we propose a two-stage query mechanism enhanced by hierarchical product quantization (PQ): the hierarchical PQ rapidly and accurately identifies promising candidates, and exact refinement on raw vectors is performed for only a small number of them. This significantly reduces both the I/O and the computational cost of the refinement stage. Overall, DGAI achieves resource-efficient updates and low-latency queries simultaneously. Experimental results demonstrate that DGAI improves update speed by 8.17x for insertions and 8.16x for deletions, while reducing peak query latency under mixed workloads by 67% compared to state-of-the-art baselines.
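To make the two-stage query idea concrete, the following is a minimal sketch and not DGAI's implementation. It assumes a plain single-level PQ rather than the hierarchical PQ proposed here, hand-picked codebooks instead of trained ones, and a brute-force scan in place of graph traversal. All names (`encode`, `two_stage_search`, `refine_n`) are hypothetical. Stage 1 ranks every point using only compact PQ codes; stage 2 touches raw vectors for just `refine_n` candidates, which is where the I/O saving would come from when raw vectors live on disk.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(v: Vector, m: int) -> List[Vector]:
    """Split v into m equal-length sub-vectors (len(v) must be divisible by m)."""
    step = len(v) // m
    return [v[i * step:(i + 1) * step] for i in range(m)]


def encode(v: Vector, codebooks) -> List[int]:
    """PQ-encode v: the index of the nearest centroid in each subspace."""
    return [
        min(range(len(cb)), key=lambda j: sq_dist(part, cb[j]))
        for part, cb in zip(split(v, len(codebooks)), codebooks)
    ]


def two_stage_search(query, codes, raw_vectors, codebooks, k, refine_n):
    # Stage 1: approximate distances from PQ codes only (asymmetric distance:
    # exact query sub-vectors against quantized data sub-vectors).
    parts = split(query, len(codebooks))
    tables = [[sq_dist(p, c) for c in cb] for p, cb in zip(parts, codebooks)]
    ranked = sorted(
        range(len(codes)),
        key=lambda i: sum(t[c] for t, c in zip(tables, codes[i])),
    )
    candidates = ranked[:refine_n]
    # Stage 2: exact refinement; only these candidates need their raw vectors.
    refined = sorted(candidates, key=lambda i: sq_dist(query, raw_vectors[i]))
    return refined[:k]


# Toy data (assumed for illustration): 4-dim vectors, 2 subspaces, 2 centroids each.
codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 1.0)]]
raw_vectors = [
    (0.1, 0.0, 0.0, 0.1),
    (0.9, 1.0, 0.1, 0.0),
    (1.0, 0.9, 1.0, 1.1),
    (0.0, 0.2, 0.9, 1.0),
    (0.8, 0.8, 0.9, 0.9),
]
codes = [encode(v, codebooks) for v in raw_vectors]
result = two_stage_search((1.0, 1.0, 1.0, 1.0), codes, raw_vectors,
                          codebooks, k=2, refine_n=2)
print(result)
```

In this toy run only two of the five raw vectors are read during refinement. How small `refine_n` can be kept without hurting recall depends on how accurate the stage-1 ranking is, which is the motivation for a more precise quantizer in the first stage.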