Rethinking ANN-based Retrieval: Multifaceted Learnable Index for Large-scale Recommendation System

Jiang Zhang, Yubo Wang, Wei Chang, Lu Han, Xingying Cheng, Feng Zhang, Min Li, Songhao Jiang, Wei Zheng, Harry Tran, Zhen Wang, Lei Chen, Yueming Wang, Benyu Zhang, Xiangjun Fan, Bi Xue, Qifan Wang

Approximate nearest neighbor (ANN) search is widely used in the retrieval stage of large-scale recommendation systems. In this stage, candidate items are indexed by their learned embedding vectors, and an ANN search is executed for each user (or item) query to retrieve a set of relevant items. However, ANN-based retrieval has two key limitations. First, item embeddings and their indices are typically learned in separate stages: indexing is often performed offline after the embeddings are trained, which can yield suboptimal retrieval quality, especially for newly created items. Second, although ANN offers sublinear query time, it must still be run for every request, incurring substantial computation cost at industry scale. In this paper, we propose MultiFaceted Learnable Index (MFLI), a scalable, real-time retrieval paradigm that learns multifaceted item embeddings and indices within a unified framework and eliminates ANN search at serving time. Specifically, we construct a multifaceted hierarchical codebook via residual quantization of item embeddings and co-train the codebook with the embeddings. We further introduce an efficient multifaceted indexing structure and mechanisms that support real-time updates. At serving time, the learned hierarchical indices are used directly to identify relevant items, avoiding ANN search altogether. Extensive experiments on real-world data with billions of users show that MFLI improves recall on engagement tasks by up to 11.8%, cold-content delivery by up to 57.29%, and semantic relevance by 13.5% compared with prior state-of-the-art methods. We also deploy MFLI in the system and report online experimental results demonstrating improved engagement, reduced popularity bias, and higher serving efficiency.
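
To make the idea of a hierarchical index built by residual quantization more concrete, the following is a minimal sketch of generic residual quantization with a direct bucket lookup. It is not the MFLI implementation: the codebooks here are fixed by hand rather than co-trained with the embeddings, there is only a single facet, and all names (`CODEBOOKS`, `ITEMS`, `rq_encode`, `build_index`, `retrieve`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared L2 distance)."""
    def dist(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rq_encode(codebooks, vec):
    """Residual quantization: at each level pick the nearest codeword,
    subtract it from the residual, and pass the remainder to the next level."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes)


def build_index(codebooks, items):
    """Map each hierarchical code path to the list of item ids that share it."""
    index = defaultdict(list)
    for item_id, emb in items.items():
        index[rq_encode(codebooks, emb)].append(item_id)
    return index


# Hypothetical two-level codebooks (assumed fixed here, not learned).
CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0]],    # level 1: coarse partition
    [[0.1, 0.0], [-0.1, 0.0]],   # level 2: refinement of the residual
]

# Hypothetical item embeddings.
ITEMS = {
    "a": [1.1, 0.0],
    "b": [0.9, 0.0],
    "c": [0.05, 1.0],
}

INDEX = build_index(CODEBOOKS, ITEMS)


def retrieve(query_emb):
    """Encode the query into a code path and read its bucket directly;
    no nearest-neighbor search over item embeddings is performed."""
    return INDEX.get(rq_encode(CODEBOOKS, query_emb), [])
```

In this sketch, a newly created item can be made retrievable by encoding its embedding with `rq_encode` and appending its id to the matching bucket, which loosely mirrors why a code-based index lends itself to real-time updates. How MFLI actually learns the codebooks, handles multiple facets, and maps queries to indices is not specified in the abstract.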
