We present a geometric framework for filtered approximate nearest neighbor (ANN) search. Filtering a proximity graph by a metadata predicate produces a subgraph, called a fiber, whose connectivity and geometry can differ sharply from those of the full graph. Using local signals, we propose a two-phase search algorithm that combines full-graph exploration with filtered-neighbor descent when the local geometry is favorable. These signals also classify search failures into three regimes: topological cuts, geometric folds, and genuine basins. A key observation is that all three share a common resolution: restarting the search in a fiber-present cluster near the query. To support this, we introduce a lightweight anchor structure that identifies such clusters and restarts the search there. We show empirically that the method outperforms FAISS HNSW on filtered search and that the three failure regimes separate cleanly and shift predictably with filter selectivity.
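To make the notion of a fiber concrete, the following minimal sketch filters a toy proximity graph by a metadata predicate and compares connectivity before and after. It is an illustration only, not the method described above: the helper names `fiber` and `components`, the five-node path graph, and the labels are all assumptions made for this example.

```python
from collections import deque

def fiber(adj, keep):
    """Induced subgraph on the nodes that satisfy the predicate `keep`."""
    return {u: [v for v in nbrs if keep(v)]
            for u, nbrs in adj.items() if keep(u)}

def components(adj):
    """Connected components of an undirected adjacency dict, found by BFS."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        comps.append(sorted(comp))
    return comps

# Hypothetical toy data: a path graph 0-1-2-3-4 with one metadata label per node.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = {0: "a", 1: "a", 2: "b", 3: "a", 4: "a"}

full = components(adj)
filtered = components(fiber(adj, lambda n: labels[n] == "a"))

print(full)      # [[0, 1, 2, 3, 4]]
print(filtered)  # [[0, 1], [3, 4]]
```

Removing the single node that fails the predicate splits the fiber into two pieces, so a walk restricted to filtered neighbors that starts in one piece cannot reach the other. This is the kind of connectivity change that filtering can introduce even when the full graph is connected.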