Embedding-based retrieval (EBR) methods are widely used in modern recommender systems thanks to their simplicity and effectiveness. However, in deploying and iterating on EBR in production, we have identified some fundamental issues in existing methods. First, when dealing with a large corpus of candidate items, EBR models often have difficulty balancing two goals: distinguishing highly relevant items (positives) from irrelevant ones (easy negatives), and distinguishing them from somewhat related yet uncompetitive ones (hard negatives). Second, we have little control over the diversity and fairness of the retrieval results because of the "greedy" nature of nearest-neighbor vector search. These issues compromise the performance of EBR methods in large-scale industrial scenarios. This paper introduces a simple, proven-in-production solution to overcome them. The proposed solution takes a divide-and-conquer approach: the whole set of candidate items is divided into multiple clusters, and we run EBR to retrieve relevant candidates from each cluster in parallel; the top candidates from each cluster are then combined by controllable merging strategies. This approach allows our EBR models to concentrate on discriminating positives from mostly hard negatives. It also enables further improvement from a multi-task learning (MTL) perspective: the retrieval problem within each cluster can be regarded as an individual task. Inspired by recent successes in prompting and prefix-tuning, we propose an efficient task adaptation technique that further boosts retrieval performance within each cluster with negligible overhead.
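The divide-and-conquer retrieval described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity function, the clustering method, or the merging strategy, so inner-product scoring, precomputed cluster labels, and a per-cluster quota merge are all assumptions made here for illustration. The function names `retrieve_per_cluster` and `merge_with_quotas` are hypothetical, and the per-cluster loop stands in for what would run in parallel in production.

```python
import numpy as np


def retrieve_per_cluster(query, item_vecs, cluster_ids, k):
    """Return {cluster: [(item_index, score), ...]} with up to k items per cluster."""
    # Assumption: relevance is the inner product of query and item embeddings.
    scores = item_vecs @ query
    results = {}
    for c in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == c)
        # Rank only the items inside this cluster, best score first.
        order = np.argsort(-scores[members], kind="stable")[:k]
        results[int(c)] = [(int(members[i]), float(scores[members[i]])) for i in order]
    return results


def merge_with_quotas(per_cluster, quotas):
    """Keep at most quotas[c] items from cluster c, then sort the union by score."""
    # Assumption: a fixed quota per cluster is one possible "controllable" merge.
    merged = []
    for c, hits in per_cluster.items():
        merged.extend(hits[: quotas.get(c, 0)])
    merged.sort(key=lambda pair: -pair[1])
    return merged


# Toy data: three items in cluster 0 lie close to the query, two in cluster 1 do not.
item_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]])
cluster_ids = np.array([0, 0, 0, 1, 1])
query = np.array([1.0, 0.0])

per_cluster = retrieve_per_cluster(query, item_vecs, cluster_ids, k=2)
final = merge_with_quotas(per_cluster, quotas={0: 2, 1: 1})
print([idx for idx, _ in final])  # [0, 1, 4]
```

In this toy setup a plain global top-3 search would return items 0, 1, and 2, all from cluster 0. The quota merge instead reserves one slot for cluster 1, which is the kind of control over diversity that a single greedy nearest-neighbor search does not offer.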