Approximate k-Nearest Neighbor (AKNN) search is widely used in vector databases. When vectors carry additional attributes (e.g., labels or numerical values), filtered AKNN search retrieves the vectors nearest to a query vector among those that satisfy given attribute constraints. Most existing methods use a fixed termination condition, searching the entire index while respecting attribute filters. Because different queries require different amounts of search effort, a fixed condition causes substantial redundant computation and misses early-termination opportunities for easy queries. This paper proposes a lightweight model that estimates the search cost of a filtered AKNN query and uses the estimate for adaptive termination: for easy queries, the search stops early to reduce latency, while for hard queries, it continues longer to preserve accuracy. The key challenge is accurate cost prediction under attribute filters. To address this, we show that information collected during an early probing phase (e.g., attribute distributions and intermediate distance statistics) can effectively predict the overall search cost. Experiments on six real-world datasets demonstrate a 1.1x to 3.7x speedup over state-of-the-art baselines at 95% recall.
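The adaptive-termination idea described above can be illustrated with a minimal sketch. It is not the paper's method: a linear scan stands in for index traversal, and `toy_predictor` is a hypothetical hand-written rule standing in for the learned cost model. The names `filtered_search`, `probe_steps`, `slack`, and the feature keys `pass_rate` and `kth_dist` are assumptions made for illustration only.

```python
import heapq
import random


def toy_predictor(features, k=5, slack=20):
    """Hypothetical stand-in for a learned cost model: selective filters
    (low pass rate) get a larger visit budget. Ignores features["kth_dist"]."""
    rate = max(features["pass_rate"], 1e-3)
    return int(slack * k / rate)


def filtered_search(candidates, query, k, passes_filter, probe_steps, predict_budget):
    """Scan candidates in order, keeping the k nearest that pass the filter.

    After `probe_steps` visits, features gathered so far are handed to
    `predict_budget`, and the scan stops once the predicted budget is reached.
    Returns (sorted list of (squared_distance, index), number of visits).
    """
    heap = []  # max-heap on distance, implemented with negated distances
    passed = 0
    visited = 0
    budget = len(candidates)  # no early stop before the probe completes
    for visited, (vec, attr) in enumerate(candidates, start=1):
        if passes_filter(attr):
            passed += 1
            dist = sum((a - b) ** 2 for a, b in zip(vec, query))
            if len(heap) < k:
                heapq.heappush(heap, (-dist, visited - 1))
            elif dist < -heap[0][0]:
                heapq.heapreplace(heap, (-dist, visited - 1))
        if visited == probe_steps:
            # Probing-phase statistics: filter pass rate and current k-th distance.
            features = {
                "pass_rate": passed / visited,
                "kth_dist": -heap[0][0] if len(heap) == k else float("inf"),
            }
            predicted = predict_budget(features)
            budget = max(probe_steps, min(len(candidates), predicted))
        if visited >= budget:
            break
    results = sorted((-neg_dist, idx) for neg_dist, idx in heap)
    return results, visited


# Toy data: 1000 random 4-d vectors, each with a label attribute in 0..9.
rng = random.Random(0)
data = [([rng.random() for _ in range(4)], i % 10) for i in range(1000)]
query = [0.5, 0.5, 0.5, 0.5]

# "Easy" query: the filter accepts 90% of vectors, so a small budget suffices.
easy_results, easy_visited = filtered_search(
    data, query, 5, lambda label: label != 0, 50, toy_predictor)

# "Hard" query: the filter accepts 10% of vectors, so the budget grows.
hard_results, hard_visited = filtered_search(
    data, query, 5, lambda label: label == 0, 50, toy_predictor)
```

In this sketch the easy query stops after far fewer visits than the hard one, which is the behavior the abstract attributes to adaptive termination. A real system would replace the linear scan with graph or partition traversal and the hand-written rule with a trained model.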