Filtered approximate nearest neighbor (ANN) search is an increasingly important problem in vector retrieval, yet systems face a difficult trade-off in execution order. Pre-filtering (filtering first, then running ANN over the passing subset) requires expensive per-predicate index construction, while post-filtering (running ANN first, then filtering the candidates) can waste computation and lose recall under low selectivity, when few vectors pass the filter and too few candidates remain after filtering. We introduce a learning-based query planning framework that dynamically selects the most effective execution plan for each query, using lightweight predictions derived from dataset and query statistics (e.g., dimensionality, corpus size, distribution features, and predicate statistics). The framework supports diverse filter types, including categorical/keyword and range predicates, and works with any backend ANN index. Experiments show that our method achieves up to 4x speedup at >= 90% recall compared with strong baselines.
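To make the two execution orders concrete, the following is a minimal sketch, not the method of the paper. It assumes a brute-force exact search as a stand-in for the ANN index, a single categorical predicate, and a fixed selectivity threshold as a hypothetical placeholder for the learned planner. All function names, the `overfetch` factor, and the `threshold` value are illustrative assumptions.

```python
import heapq


def exact_knn(query, vectors, ids, k):
    """Brute-force k nearest neighbours over `ids` (stand-in for an ANN index)."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(query, vectors[i]))
    return heapq.nsmallest(k, ids, key=sq_dist)


def pre_filter(query, vectors, labels, wanted, k):
    """Filter first, then search only the subset that passes the predicate."""
    passing = [i for i, lab in enumerate(labels) if lab == wanted]
    return exact_knn(query, vectors, passing, k)


def post_filter(query, vectors, labels, wanted, k, overfetch=4):
    """Search first (fetching k * overfetch candidates), then apply the predicate."""
    candidates = exact_knn(query, vectors, range(len(vectors)), k * overfetch)
    # May return fewer than k results when few candidates pass the filter.
    return [i for i in candidates if labels[i] == wanted][:k]


def choose_plan(selectivity, threshold=0.1):
    """Hypothetical rule-based placeholder for the learned planner."""
    return pre_filter if selectivity < threshold else post_filter


def filtered_search(query, vectors, labels, wanted, k):
    """Estimate the fraction of passing vectors, pick a plan, and run it."""
    selectivity = sum(lab == wanted for lab in labels) / len(labels)
    plan = choose_plan(selectivity)
    return plan.__name__, plan(query, vectors, labels, wanted, k)


if __name__ == "__main__":
    vectors = [[float(i % 17), float(i % 5)] for i in range(200)]
    labels = ["rare" if i % 50 == 0 else "common" for i in range(200)]
    print(filtered_search([3.0, 2.0], vectors, labels, "rare", k=2))
    print(filtered_search([3.0, 2.0], vectors, labels, "common", k=2))
```

In this sketch, a rare label routes the query to `pre_filter`, since `post_filter` would likely discard most of its candidates, while a common label routes it to `post_filter`. A learned planner as described above would replace the single threshold with a prediction based on dataset and query statistics.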