On-disk graph-based indexes are favored for billion-scale Approximate Nearest Neighbor Search (ANNS) because of their high performance and cost-efficiency. However, existing systems typically rely on a coupled storage architecture that co-locates vectors and graph topology. This coupling causes substantial redundant I/O during index updates and degrades usability under dynamic workloads. In this paper, we propose a decoupled storage architecture that physically separates the heavy vectors from the lightweight graph topology. This design substantially improves update performance by reducing redundant I/O, but it introduces I/O amplification during ANNS and thus degrades query efficiency. To improve query performance within this update-friendly architecture, we propose two techniques co-designed with the decoupled storage. First, we develop a similarity-aware dynamic layout that optimizes data placement online so that redundantly fetched data can be reused in subsequent search steps, effectively turning read amplification into useful prefetching. Second, we propose a two-stage query mechanism that uses hierarchical PQ to rapidly and accurately identify promising candidates and then performs exact refinement on raw vectors for only a small number of them, which significantly reduces both the I/O and the computational cost of the refinement stage. Overall, the resulting system, DGAI, achieves resource-efficient updates and low-latency queries simultaneously. Experimental results demonstrate that DGAI improves update speed by 8.17x for insertions and 8.16x for deletions, while reducing peak query latency under mixed workloads by 67% compared to state-of-the-art baselines.
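The two-stage idea described in the abstract (rank with compact codes first, then read raw vectors for only a few candidates) can be illustrated with a minimal sketch. This is not DGAI's implementation: it assumes a plain single-level product quantizer rather than the paper's hierarchical PQ, scans a flat list instead of traversing a graph, and keeps everything in memory. All names (`encode`, `two_stage_search`, `n_candidates`) are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# codebooks[s] is the list of centroids (sub-vectors) for subspace s.
Codebooks = List[List[Vector]]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(v: Vector, m: int) -> List[Vector]:
    """Split v into m contiguous sub-vectors of equal length."""
    step = len(v) // m
    return [v[i * step:(i + 1) * step] for i in range(m)]


def encode(v: Vector, codebooks: Codebooks) -> List[int]:
    """PQ code of v: the index of the nearest centroid in each subspace."""
    subs = split(v, len(codebooks))
    return [
        min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
        for sub, cb in zip(subs, codebooks)
    ]


def two_stage_search(
    query: Vector,
    codes: List[List[int]],
    raw_vectors: List[Vector],
    codebooks: Codebooks,
    n_candidates: int,
    k: int,
) -> Tuple[List[int], int]:
    """Return (top-k ids, number of raw vectors read during refinement)."""
    # Stage 1: approximate ranking from PQ codes only.
    # table[s][j] = distance from the query's s-th sub-vector to centroid j.
    q_subs = split(query, len(codebooks))
    table = [[sq_dist(qs, c) for c in cb] for qs, cb in zip(q_subs, codebooks)]
    approx = sorted(
        (sum(table[s][code[s]] for s in range(len(code))), i)
        for i, code in enumerate(codes)
    )
    candidates = [i for _, i in approx[:n_candidates]]

    # Stage 2: exact refinement. Raw vectors are touched only here, so the
    # number of raw reads is bounded by n_candidates, not by the dataset size.
    exact = sorted((sq_dist(query, raw_vectors[i]), i) for i in candidates)
    return [i for _, i in exact[:k]], len(candidates)
```

In this sketch, `n_candidates` is the knob that trades accuracy for refinement cost: a larger value tolerates more quantization error in stage 1 but requires more raw-vector reads in stage 2.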