With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest. While the community has recently proposed several algorithms for constrained ANNS, almost all of these methods focus on integration with graph-based indexes, the predominant class of algorithms achieving state-of-the-art latency-recall tradeoffs. In this work, we take a different approach and develop a constrained ANNS algorithm based on space partitioning rather than graphs. To that end, we introduce Constrained Approximate Partitioned Search (CAPS), an index for ANNS with filters via space partitions. CAPS not only retains the benefits of a partition-based algorithm but also outperforms state-of-the-art graph-based constrained search techniques in latency-recall tradeoffs, while using only 10% of their index size.
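To make the idea of filtered search over space partitions concrete, the following is a minimal sketch of a generic partition-based index with an attribute filter. It is not the CAPS algorithm itself: the abstract does not specify how CAPS partitions the space or handles constraints, so the k-means partitioning, the single integer label per vector, and the names `build_partitions`, `filtered_search`, `n_probe`, and `required_label` are all assumptions chosen for illustration.

```python
import numpy as np


def build_partitions(data, n_partitions, n_iters=10, seed=0):
    """Partition vectors with plain k-means (an assumed choice, not CAPS's scheme).

    Returns the centroids and, for each partition, the ids of its member vectors.
    """
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), n_partitions, replace=False)].copy()
    for _ in range(n_iters):
        # Squared Euclidean distance from every vector to every centroid.
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for c in range(n_partitions):
            members = data[assign == c]
            if len(members):
                centroids[c] = members.mean(0)
    # Final assignment against the final centroids.
    dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(1)
    partitions = [np.flatnonzero(assign == c) for c in range(n_partitions)]
    return centroids, partitions


def filtered_search(query, required_label, data, labels,
                    centroids, partitions, n_probe, k):
    """Return ids of up to k nearest vectors whose label equals required_label.

    Only the n_probe partitions closest to the query are scanned, so the
    result is approximate unless n_probe equals the number of partitions.
    """
    nearest = np.argsort(((centroids - query) ** 2).sum(1))[:n_probe]
    candidates = np.concatenate([partitions[c] for c in nearest])
    # Apply the constraint before computing any exact distances.
    candidates = candidates[labels[candidates] == required_label]
    if candidates.size == 0:
        return candidates
    dists = ((data[candidates] - query) ** 2).sum(1)
    return candidates[np.argsort(dists)[:k]]
```

In this sketch, `n_probe` is the knob that trades latency for recall: probing more partitions scans more candidates and recovers more of the true filtered neighbors. The index stores only the centroids and one id list per partition, which hints at why partition-based indexes can be much smaller than graph-based ones.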