Retrieval plays a fundamental role in recommendation systems, search, and natural language processing by efficiently finding relevant items from a large corpus given a query. Dot products have been widely used as the similarity function in such retrieval tasks, thanks to Maximum Inner Product Search (MIPS), which enables efficient retrieval based on dot products. However, state-of-the-art retrieval algorithms have migrated to learned similarities. Such algorithms vary in form: queries can be represented with multiple embeddings, complex neural networks can be deployed, item IDs can be decoded directly from queries using beam search, and multiple approaches can be combined in hybrid solutions. Unfortunately, we lack efficient retrieval solutions for these state-of-the-art setups. Our work investigates techniques for approximate nearest neighbor search with learned similarity functions. We first prove that Mixture-of-Logits (MoL) is a universal approximator and can express all learned similarity functions. We next propose techniques to retrieve the approximate top-K results using MoL with a tight bound. We finally compare our techniques with existing approaches, showing that MoL sets new state-of-the-art results on recommendation retrieval tasks, and that our approximate top-K retrieval with learned similarities outperforms baselines by up to two orders of magnitude in latency while achieving >.99 recall relative to exact algorithms.
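To make the MoL form concrete, here is a minimal sketch of a Mixture-of-Logits similarity: a gated combination of several component dot products between paired query/item embeddings. This is an illustration only; the component count `P`, embedding size `d`, and the source of the gating logits (in practice produced by a learned network conditioned on the query and item) are assumptions, not the paper's exact parameterization.

```python
import numpy as np

def mol_similarity(q_embs, x_embs, gate_logits):
    """Mixture-of-Logits (MoL) similarity sketch.

    q_embs:      (P, d) query-side component embeddings
    x_embs:      (P, d) item-side component embeddings
    gate_logits: (P,) gating logits; assumed here to be supplied
                 externally, whereas MoL learns them adaptively
    """
    # P component dot products ("logits")
    logits = np.einsum("pd,pd->p", q_embs, x_embs)
    # Softmax over the gating logits yields the mixture weights
    gates = np.exp(gate_logits - gate_logits.max())
    gates /= gates.sum()
    # Final similarity: gate-weighted sum of the component dot products
    return float(gates @ logits)
```

With uniform gating logits, the score reduces to the plain average of the component dot products; the learned, query/item-dependent gates are what lift MoL beyond a single dot product.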