On-disk graph-based indexes are favored for billion-scale Approximate Nearest Neighbor Search (ANNS) because of their high performance and cost-efficiency. However, existing systems typically rely on a coupled storage architecture that co-locates vectors and graph topology, which introduces substantial redundant I/O during index updates and degrades usability in dynamic workloads. In this paper, we propose a decoupled storage architecture that physically separates the heavy vectors from the lightweight graph topology. This design substantially improves update performance by reducing redundant I/O, but it introduces I/O amplification during ANNS, which degrades query efficiency. To improve query performance within this update-friendly architecture, we propose two techniques co-designed with the decoupled storage. First, we develop a similarity-aware dynamic layout that optimizes data placement online so that redundantly fetched data can be reused in subsequent search steps, effectively turning read amplification into useful prefetching. Second, we propose a two-stage query mechanism that uses hierarchical product quantization (PQ) to rapidly and accurately identify promising candidates and performs exact refinement on raw vectors for only a small number of them. This design significantly reduces both the I/O and computational cost of the refinement stage. The resulting system, DGAI, achieves resource-efficient updates and low-latency queries simultaneously. Experimental results demonstrate that DGAI improves update speed by 8.17x for insertions and 8.16x for deletions, while reducing peak query latency under mixed workloads by 67% compared to state-of-the-art baselines.
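The two-stage query idea can be illustrated with a minimal sketch. It is not the DGAI implementation: a coarse grid quantizer stands in for hierarchical PQ, the candidate set is a flat list rather than a graph traversal, and every name (`RawVectorStore`, `two_stage_search`, `refine_size`, `step`) is hypothetical. It only shows the general pattern of ranking candidates with compact in-memory codes and then reading raw vectors for a small shortlist.

```python
import random


def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vec, step=0.25):
    """Assumed stand-in for PQ: snap each coordinate to a coarse grid."""
    return tuple(round(x / step) for x in vec)


def dequantize(code, step=0.25):
    """Reconstruct an approximate vector from its grid code."""
    return [c * step for c in code]


class RawVectorStore:
    """Simulates the separate on-disk vector file and counts reads."""

    def __init__(self, vectors):
        self._vectors = vectors
        self.reads = 0

    def read(self, idx):
        self.reads += 1
        return self._vectors[idx]


def two_stage_search(query, codes, store, k, refine_size, step=0.25):
    # Stage 1: rank all candidates by approximate distance, using only
    # the compact codes held in memory (no raw-vector reads).
    ranked = sorted(
        range(len(codes)),
        key=lambda i: l2_sq(query, dequantize(codes[i], step)),
    )
    shortlist = ranked[:refine_size]

    # Stage 2: read raw vectors for the shortlist only and re-rank
    # them by exact distance.
    refined = sorted(shortlist, key=lambda i: l2_sq(query, store.read(i)))
    return refined[:k]


# Small demonstration on synthetic data.
random.seed(0)
dim, n = 8, 1000
vectors = [[random.random() for _ in range(dim)] for _ in range(n)]
codes = [quantize(v) for v in vectors]
store = RawVectorStore(vectors)
query = [random.random() for _ in range(dim)]

result = two_stage_search(query, codes, store, k=5, refine_size=50)
print("top-5 ids:", result)
print("raw-vector reads:", store.reads, "of", n)
```

In this sketch the number of raw-vector reads equals `refine_size` rather than the number of candidates examined, which is the source of the I/O savings the refinement stage is meant to provide. The quality of the shortlist depends on how accurate the approximate distances are.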